microsoft / LLMLingua

To speed up LLM inference and enhance the LLM's perception of key information, compress the prompt and KV-Cache, achieving up to 20x compression with minimal performance loss.
https://llmlingua.com/
MIT License
4.27k stars 228 forks

Issues with reproducing LongLLMLingua on the LongBench dataset. #94

Open yunlongia opened 5 months ago

yunlongia commented 5 months ago

Really appreciate your fascinating work!

There is no documentation in the examples regarding reproducing the paper's results on the LongBench dataset. Is there any plan to release the scripts used to evaluate LongLLMLingua on LongBench?

I would like to understand some implementation details, including how to use LongLLMLingua for compressing long texts in the LongBench dataset, as well as how to use LongChat with compressed context for inference.

iofu728 commented 5 months ago

Hi @yunlongia,

Thank you for your interest and support.

For the compression parameters, you can follow the guidelines at https://github.com/microsoft/LLMLingua/blob/main/Transparency_FAQ.md#how-to-reproduce-the-result-in-llmlingua--longllmlingua.
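For reference, a minimal sketch of the compression call, with parameter values taken from the Transparency FAQ (treat the exact values, e.g. `target_token=2000`, as starting points to tune per LongBench task):

```python
# Minimal sketch; parameter values follow the Transparency FAQ but should
# be tuned per task. `context_list`, `instruction`, and `question` are
# assumed to come from your LongBench data loading.
from llmlingua import PromptCompressor

llm_lingua = PromptCompressor()  # defaults to NousResearch/Llama-2-7b-hf

compressed = llm_lingua.compress_prompt(
    context_list,                       # list[str]: the pre-split documents
    instruction=instruction,
    question=question,
    target_token=2000,
    condition_compare=True,             # question-aware compression
    condition_in_question="after",
    rank_method="longllmlingua",        # coarse-grained document ranking
    use_sentence_level_filter=False,
    context_budget="+100",
    dynamic_context_compression_ratio=0.4,
    reorder_context="sort",
)
print(compressed["compressed_prompt"])
```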

For the LongChat experiments, we used the code base from https://github.com/nelson-liu/lost-in-the-middle/blob/main/scripts/get_qa_responses_from_longchat.py, with the only modification being that the input data was changed to the compressed prompt.
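As a rough sketch of that modification (not the exact script; the file name and field names below are hypothetical), the inference loop boils down to feeding the compressed prompt to LongChat instead of the concatenated documents:

```python
# Hypothetical sketch: run LongChat over previously saved compressed
# prompts. File and field names ("compressed_prompts.jsonl",
# "compressed_prompt") are placeholders, not part of the released code.
import json
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "lmsys/longchat-13b-16k"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

with open("compressed_prompts.jsonl") as f:
    for line in f:
        example = json.loads(line)
        # The only change vs. the original script: the input is the
        # compressed prompt rather than the raw documents.
        inputs = tokenizer(example["compressed_prompt"], return_tensors="pt").to(model.device)
        outputs = model.generate(**inputs, max_new_tokens=100, do_sample=False)
        answer = tokenizer.decode(
            outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
        )
        print(answer)
```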

yunlongia commented 5 months ago

Thank you for your response. I'm still unclear on some details. LongLLMLingua includes Question-Aware Coarse-Grained Compression over documents. How should a long context be split into multiple documents for each dataset in the LongBench benchmark?

yunlongia commented 4 months ago

@iofu728

iofu728 commented 4 months ago

Hi @yunlongia, we plan to release these scripts in the upcoming weeks. Initially, you can try splitting the context using "\n\n" or another delimiter.
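As a quick illustration of that suggestion (a sketch, not the official script), the split can be done before compression; `long_context`, `question`, and `llm_lingua` are assumed from the setup above:

```python
# Sketch of the suggested segmentation: split a LongBench context into
# pseudo-documents on blank lines, dropping empty segments, then compress.
context_list = [seg for seg in long_context.split("\n\n") if seg.strip()]

compressed = llm_lingua.compress_prompt(
    context_list,
    question=question,
    rank_method="longllmlingua",
    target_token=2000,
)
```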

yunlongia commented 1 month ago

Hi, do you currently have an example of how to perform document segmentation on LongBench?