Closed vikram71198 closed 6 months ago
Hi @vikram71198, thank you for your interest in LLMLingua. You can use `keep_split` to retain the `"\n\n"` separators between each dialogue turn. After obtaining the compressed prompt, you can split the dialogue on `"\n\n"` and restore the speaker information.

The example given in your notebook is very helpful for QA tasks, where the answer does not have to be aggregated across multiple portions of the context & is present in a definite, limited text span within the context.
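To illustrate that post-processing step, here is a minimal sketch of re-attaching speaker labels after splitting the compressed prompt on `"\n\n"` — the function name and the assumption that turns alternate `Agent:`/`Customer:` are mine, not part of LLMLingua:

```python
def restore_speakers(compressed: str, speakers=("Agent:", "Customer:")) -> str:
    """Re-attach speaker labels to compressed turns split on "\n\n".

    Assumes turns strictly alternate between the given speakers and that
    compression (with keep_split) preserved the "\n\n" boundaries.
    """
    turns = [t.strip() for t in compressed.split("\n\n") if t.strip()]
    restored = []
    for i, turn in enumerate(turns):
        label = speakers[i % len(speakers)]
        # Only prepend the label if compression dropped it from this turn.
        if not turn.startswith(label):
            turn = f"{label} {turn}"
        restored.append(turn)
    return "\n\n".join(restored)
```

If compression drops an entire turn, the alternation assumption breaks, so this is only a starting point.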
What I mean is, given a conversation between agent & customer at a customer service center, if we wanted to figure out how satisfied the customer is with their issue resolution, the compression technique outlined in that notebook would not really work because this answer is supposed to be an aggregation over multiple portions of the dialogue.
This is just a suspicion; I'm yet to test this out.
Hi @vikram71198,
Yes, scenarios similar to RAG, where the answer appears directly in the text, are highly suitable for our method. However, we have also tested more complex scenarios, such as multi-hop QA and other tasks requiring global information. Our method employs a coarse-to-fine approach, not only using the coarse level to eliminate irrelevant documents/segments but also performing compression at a finer granularity. This mechanism allows us to perform well in tasks that require global information.
Hi,
Thanks for this amazing piece of work. I was trying to use this framework to compress a prompt, which has a dialogue between two people as context & I was trying to compress the dialogue alone. I leave the instruction & question uncompressed.
So far, even with low compression ratios like 0.1-0.15, I'm seeing significant deviation in outputs for the compressed prompt in comparison to the original, uncompressed prompt. In fact, the compressed prompt that comes out also tends to be unintelligible in quite a few places. I was using the same params as you do here, although I'm not entirely sure what `context_budget` does exactly.

Also, I currently pass the dialogue in as a `str`. Would it make any difference to segment it line by line and pass it in as a `List[str]`?

The dialogue has speaker roles like `Agent:` & `Customer:` that are sometimes dropped after compressing. Is there a way I can make sure some tokens are never dropped? I'm guessing you do that using `force_context_ids`? Does this param take input_ids after tokenizing? I'm confused.

Do you have any suggestions on what would be a good/optimal param setting to compress dialogues?
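On the `str` vs `List[str]` point: as I understand LLMLingua's API, passing each turn as its own list element lets the coarse-grained stage rank and drop whole turns, and `force_context_ids` then takes indices into that context list (not token ids) — though that reading is my assumption, not confirmed. A pre-processing sketch (the function name and the `Agent:`/`Customer:` label pattern are mine):

```python
import re


def dialogue_to_contexts(dialogue: str) -> list[str]:
    """Split a raw dialogue string into one string per turn.

    Assumes each turn starts on its own line with an "Agent:" or
    "Customer:" prefix.
    """
    # Zero-width split just before each line beginning with a speaker label,
    # so the label stays attached to its turn.
    parts = re.split(r"(?=^(?:Agent|Customer):)", dialogue, flags=re.M)
    return [p.strip() for p in parts if p.strip()]
```

The resulting list could then be passed as the context argument, with something like `force_context_ids=[0]` to pin the first turn — again, my reading of the parameter rather than confirmed behavior.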