microsoft / LLMLingua

To speed up LLM inference and enhance the model's perception of key information, compress the prompt and KV-Cache, achieving up to 20x compression with minimal performance loss.
https://llmlingua.com/
MIT License
4.42k stars 241 forks

Failed Compression Attempts with LLMLingua Web UI Demo #60

Open cws322 opened 7 months ago

cws322 commented 7 months ago

I conducted several prompt compression tests using the LLMLingua Web UI Demo (https://huggingface.co/spaces/microsoft/LLMLingua). However, I encountered a situation where the context could not be compressed, despite trying various compression ratios (screenshot attached). Could the inability to compress the context be attributed to a model-related issue?

iofu728 commented 7 months ago

Hi @cws322, in the HF Demo, LLMLingua assigns different compression ratios to different parts of the prompt, such as the instruction, contexts, and question. Therefore, please place the content you wish to compress in the context section.
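To illustrate the point above, here is a toy sketch (not LLMLingua's actual algorithm, which uses a small LM's perplexity to drop low-information tokens): the instruction and question sections are kept essentially verbatim, and the token budget is spent almost entirely on compressing the context. This is why content placed outside the context box may appear uncompressed. The function name and the word-level truncation are hypothetical stand-ins for illustration only.

```python
def compress_prompt_sketch(instruction, contexts, question, target_tokens):
    """Toy per-section budget allocation: instruction and question are
    preserved, and only the context is shrunk to fit the token budget."""
    inst = instruction.split()
    q = question.split()
    ctx = " ".join(contexts).split()

    # Instruction and question get a (near-)zero compression ratio;
    # whatever budget remains goes to the context.
    budget = max(target_tokens - len(inst) - len(q), 0)

    # Crude stand-in for LLMLingua's perplexity-based token dropping:
    # simply truncate the context to the remaining budget.
    kept_ctx = ctx[:budget]

    origin = len(inst) + len(ctx) + len(q)
    compressed = len(inst) + len(kept_ctx) + len(q)
    return {
        "compressed_prompt": " ".join(inst + kept_ctx + q),
        "origin_tokens": origin,
        "compressed_tokens": compressed,
        "ratio": origin / max(compressed, 1),
    }

# Example: only the context shrinks; instruction and question survive intact.
result = compress_prompt_sketch(
    instruction="Summarize the passage.",
    contexts=["long context " * 50],
    question="What is the main point?",
    target_tokens=30,
)
```

With the actual library, the same separation is expressed through the keyword arguments of `PromptCompressor.compress_prompt` (per the repository README at the time): the material to compress is passed as the context, while `instruction=` and `question=` receive the parts to preserve.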

I hope this resolves your issue.