Hi @acnagle, thank you for your question. At the segment level, we only use the `condition_in_question` parameter, which specifies whether the question is positioned before or after the context. At the token level, we only use the `condition_compare` parameter, which chooses between plain perplexity and conditional (contrastive) perplexity. Therefore, with `condition_in_question="after"` and `condition_compare=True`, the segment-level compression is not based on $p(\text{context} \mid \text{question})$: since the question comes after the context, the context tokens are never conditioned on it.
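To make "question after context" concrete, here is a minimal sketch of scoring one segment under that setting, assuming a Hugging Face causal LM. The helper name and scoring details are illustrative, not the exact repository code:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative helper (not the repo's implementation): score a segment with the
# question placed AFTER it. Under a causal LM the segment tokens never attend
# to the question, so the score reflects p(question | segment), not
# p(segment | question).
def question_after_score(model, tokenizer, segment: str, question: str) -> float:
    seg_ids = tokenizer(segment, return_tensors="pt").input_ids
    que_ids = tokenizer(question, return_tensors="pt").input_ids
    input_ids = torch.cat([seg_ids, que_ids], dim=1)
    with torch.no_grad():
        logits = model(input_ids).logits
    # Logits at position i predict token i + 1, so the question tokens are
    # predicted starting from the last segment position.
    que_logits = logits[:, seg_ids.shape[1] - 1 : -1, :]
    log_probs = torch.log_softmax(que_logits, dim=-1)
    token_lp = log_probs.gather(-1, que_ids.unsqueeze(-1)).squeeze(-1)
    return token_lp.mean().item()  # higher = segment more relevant to the question

model = AutoModelForCausalLM.from_pretrained("gpt2")  # placeholder model
tok = AutoTokenizer.from_pretrained("gpt2")
print(question_after_score(model, tok,
                           "Paris is the capital of France.",
                           "What is the capital of France?"))
```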
Yes, we still segment the context and use a method similar to Equation (3). Also, the `condition_in_question` parameter does not take effect at the token level; the token level is controlled solely by `condition_compare`.
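And here is a minimal sketch of the token-level contrastive score from Equation (3), written with log-probabilities; again illustrative rather than the exact repository code:

```python
import torch

# Eq. (3) in log-prob form: s_i = log p(x_i | question, x_<i) - log p(x_i | x_<i).
# Since token-level "perplexity" is a negative log-likelihood, this ordering is
# the same as ppl(x_i | x_<i) - ppl(x_i | question, x_<i): tokens whose
# probability rises the most once the question is prepended score highest.
def contrastive_token_scores(model, tokenizer, context: str, question: str):
    ctx_ids = tokenizer(context, return_tensors="pt").input_ids
    que_ids = tokenizer(question, return_tensors="pt").input_ids

    def last_token_logprobs(input_ids, n_last):
        # Log-probability of each of the last n_last tokens given its prefix.
        with torch.no_grad():
            logits = model(input_ids).logits
        lp = torch.log_softmax(logits[:, :-1, :], dim=-1)
        lp = lp.gather(-1, input_ids[:, 1:].unsqueeze(-1)).squeeze(-1)
        return lp[:, -n_last:]

    n = ctx_ids.shape[1] - 1  # context tokens that have at least one prefix token
    plain = last_token_logprobs(ctx_ids, n)
    cond = last_token_logprobs(torch.cat([que_ids, ctx_ids], dim=1), n)
    return cond - plain  # prune the tokens with the lowest scores
```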
Describe the issue
I saw #103 asked a similar question, but I'm not sure I understand how this works with respect to Equation 5 from the first LLMLingua paper. If I make a query with `condition_in_question="after"` and `condition_compare=True`, then my understanding is that the probability of a compressed segment will not actually be conditioned on the query, since the context (the thing we are compressing) comes before the query appears. I know this is probably not actually a problem in the code, but I don't fully understand the implementation, and I'm trying to connect what the paper says with what the code is doing.

I see that Equation 3 in LongLLMLingua uses the contrastive perplexity to score each token, but are we still first segmenting the context and then pruning tokens from it, similarly to the original LLMLingua paper? Based on Equation 3, I'm confused about how to properly condition on the query. So, regardless of what `condition_in_question` is, do we put the query before the rest of the context in the LLM's input when computing the perplexity of $x_i$?

Any help is greatly appreciated!
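For reference, the call I'm making looks roughly like this (a sketch; the model name, inputs, and rate are placeholders, and argument defaults may differ across versions):

```python
from llmlingua import PromptCompressor

llm_lingua = PromptCompressor(model_name="NousResearch/Llama-2-7b-hf")  # placeholder model

context_list = ["First retrieved document ...", "Second retrieved document ..."]
question = "What is the capital of France?"

compressed = llm_lingua.compress_prompt(
    context_list,                   # list of context segments/documents
    question=question,
    rate=0.5,                       # placeholder compression rate
    condition_in_question="after",  # segment level: question placed after the context
    condition_compare=True,         # token level: contrastive/conditional perplexity
    rank_method="longllmlingua",
)
print(compressed["compressed_prompt"])
```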