microsoft / LLMLingua

To speed up LLM inference and enhance the LLM's perception of key information, LLMLingua compresses the prompt and KV-Cache, achieving up to 20x compression with minimal performance loss.
https://llmlingua.com/
MIT License

[Question]: Markdown table compression #126

Closed: ZhexuanZhou closed this issue 5 months ago

ZhexuanZhou commented 6 months ago

Describe the issue

Did you test QA performance on compressed Markdown tables? In my case, compression breaks the structure of the Markdown table, and the LLM no longer answers the question properly, whereas before compression the answer was as expected.

My task is RAG. Do you have any advice for compressing documents that contain tables?

iofu728 commented 6 months ago

Hi @ZhexuanZhou, thanks for your interest in LLMLingua.

Currently, there are two methods to preserve structured data.

  1. If you're using LLMLingua-2, you can pass `force_tokens` when calling `compress_prompt` so that the tokens carrying the table structure are never dropped (a fuller, self-contained sketch of both options follows this list):

     ```python
     compressed_prompt = llm_lingua.compress_prompt(prompt, rate=0.33, force_tokens=["|", "-"])
     ```
  2. If you're using LLMLingua or LongLLMLingua, you can use Structured Prompt Compression, which lets you mark spans of the prompt that must be kept verbatim. For more details, refer to https://github.com/microsoft/LLMLingua/blob/main/DOCUMENT.md#structured-prompt-compression

     ```python
     structured_prompt = """<llmlingua, compress=False>|</llmlingua><llmlingua, rate=0.4> Method</llmlingua><llmlingua, compress=False>|</llmlingua>"""
     compressed_prompt = llm_lingua.structured_compress_prompt(
         structured_prompt, instruction="", question="", rate=0.5
     )
     ```
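For context, here is a minimal end-to-end sketch of option 1. The checkpoint name follows the LLMLingua-2 examples, and the sample table is a hypothetical stand-in for a retrieved RAG document:

```python
from llmlingua import PromptCompressor

# LLMLingua-2 compressor; use_llmlingua2 selects the token-classification
# compression model instead of the original causal-LM approach.
llm_lingua = PromptCompressor(
    model_name="microsoft/llmlingua-2-xlm-roberta-large-meetingbank",
    use_llmlingua2=True,
)

# Hypothetical Markdown table standing in for a document chunk.
prompt = """| Method | Accuracy |
|--------|----------|
| A      | 0.91     |
| B      | 0.87     |"""

# force_tokens exempts the listed tokens from compression, so the pipes,
# dashes, and newlines that carry the table layout survive intact.
result = llm_lingua.compress_prompt(
    prompt,
    rate=0.33,
    force_tokens=["\n", "|", "-"],
)
print(result["compressed_prompt"])
```

Including `"\n"` in `force_tokens` matters for tables, since each row boundary is a newline.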
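And a sketch of option 2 for the LLMLingua / LongLLMLingua path, with the same caveat that the cell text here is illustrative and the constructor defaults are used for the compressor model:

```python
from llmlingua import PromptCompressor

# Original LLMLingua compressor (defaults to a causal-LM checkpoint).
llm_lingua = PromptCompressor()

# compress=False spans are copied through verbatim, so the structural
# "|" characters survive; the cell text is compressed at rate 0.4.
structured_prompt = (
    "<llmlingua, compress=False>|</llmlingua>"
    "<llmlingua, rate=0.4> Method</llmlingua>"
    "<llmlingua, compress=False>|</llmlingua>"
)

compressed = llm_lingua.structured_compress_prompt(
    structured_prompt, instruction="", question="", rate=0.5
)
print(compressed["compressed_prompt"])
```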