yunlongia opened 5 months ago
Hi @yunlongia,
Thank you for your interest and support.
For the compression parameters, you can follow the guidelines at https://github.com/microsoft/LLMLingua/blob/main/Transparency_FAQ.md#how-to-reproduce-the-result-in-llmlingua--longllmlingua.
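As a quick reference, the FAQ's reproduction settings can be gathered into a single argument dict. The keyword names below follow the public `llmlingua` `compress_prompt` API, but the specific values are assumptions taken from the FAQ and may need per-task tuning; this is a sketch, not the authors' exact configuration.

```python
# Hedged sketch: reproduction-style settings for LongLLMLingua.
# Treat the exact values as starting points, not guaranteed settings.
LONGLLMLINGUA_KWARGS = {
    "rank_method": "longllmlingua",            # question-aware coarse-grained ranking
    "condition_in_question": "after",          # condition the question after the context
    "condition_compare": True,                 # contrastive perplexity at token level
    "context_budget": "+100",                  # extra budget for the coarse stage
    "dynamic_context_compression_ratio": 0.4,  # adaptive per-document ratio
    "reorder_context": "sort",                 # reorder documents by importance
    "target_token": 2000,                      # overall compression budget
}

def build_compress_call(context_list, question, instruction=""):
    """Assemble the argument dict for PromptCompressor.compress_prompt."""
    return {
        "context": context_list,
        "instruction": instruction,
        "question": question,
        **LONGLLMLINGUA_KWARGS,
    }
```

With `llmlingua` installed, the dict can be expanded directly into the call, e.g. `llm_lingua.compress_prompt(**build_compress_call(docs, question))`.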
For the LongChat experiments, we used the code base from https://github.com/nelson-liu/lost-in-the-middle/blob/main/scripts/get_qa_responses_from_longchat.py, with the only modification being that the input data was changed to the compressed prompt.
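One way to make that "input data was changed to the compressed prompt" swap concrete: rewrite each example so the compressed prompt stands in for the original documents, leaving everything else untouched so the downstream QA script runs unmodified. The field names (`"ctxs"`, `"title"`, `"text"`) are assumptions about the lost-in-the-middle data format; adjust them to whatever the script actually reads.

```python
def replace_with_compressed(example, compressed_prompt):
    """Swap an example's documents for its compressed prompt.

    Assumes the lost-in-the-middle JSONL format, where each example
    carries a "ctxs" list of {"title", "text"} documents; the rest of
    the example (question, answers, ...) is kept intact.
    """
    new_example = dict(example)  # shallow copy; original example is not mutated
    # Collapse all documents into a single pseudo-document holding the
    # compressed prompt; the QA script then builds its prompt as usual.
    new_example["ctxs"] = [{"title": "", "text": compressed_prompt}]
    return new_example
```

Applying this line by line over the dataset file yields a drop-in replacement input for `get_qa_responses_from_longchat.py`.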
Thank you for your response. I'm still unclear on some details: LongLLMLingua applies question-aware coarse-grained compression to each document. How should a long context be split into multiple documents for each dataset in the LongBench benchmark?
@iofu728
Hi @yunlongia, we plan to release these scripts in the upcoming weeks. Initially, you can try splitting the context using "\n\n" or another delimiter.
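The suggested delimiter heuristic can be sketched as a small helper; `"\n\n"` is only the initial suggestion from this thread, and a task-specific delimiter may work better.

```python
def split_context(context: str, delimiter: str = "\n\n") -> list[str]:
    """Split a long context into pseudo-documents on a delimiter.

    Empty chunks are dropped. "\n\n" is a heuristic starting point;
    per-task delimiters (e.g. passage markers) may segment better.
    """
    return [chunk.strip() for chunk in context.split(delimiter) if chunk.strip()]
```

The resulting list is what would be passed as the `context` argument to the compressor, one entry per document.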
Hi, do you currently have an example of how to perform document segmentation on LongBench?
Really appreciate your fascinating work!
There is no documentation in the examples on reproducing the paper's results on the LongBench dataset. Are there plans to release the scripts used to evaluate LongLLMLingua on LongBench?
I would like to understand some implementation details, including how to use LongLLMLingua to compress the long texts in the LongBench dataset, and how to run LongChat inference on the compressed context.