microsoft / LLMLingua

To speed up LLM inference and enhance the LLM's perception of key information, LLMLingua compresses the prompt and KV-Cache, achieving up to 20x compression with minimal performance loss.
https://llmlingua.com/
MIT License

"IndexError: list index out of range" when compressing prompt #14

Closed: elanger4 closed 9 months ago

elanger4 commented 9 months ago

Hello,

I have been testing this library to be potentially included in our product & have constructed this small example to demonstrate the error I've repeatedly been getting:

from llmlingua import PromptCompressor

# Running with 'cpu'; on Mac without GPU
llm_lingua = PromptCompressor(device_map="cpu")

instruction = "You are a chatbot designed to answer questions about AI & ethics"
prompt = "Provide an in-depth exploration of the role of AI in enhancing the security and effectiveness of educational technology. Discuss the potential risks, such as data breaches and biased algorithms, the ethical considerations in using AI to monitor and assess student performance, and the best practices for safeguarding student data and privacy while leveraging AI to personalize and enhance the learning experience."

compressed_prompt = llm_lingua.compress_prompt(
    context=[],
    instruction=instruction,
    question=prompt,
)

print(compressed_prompt)

I get the following error:

Traceback (most recent call last):
  File "/Users/PATH/research/test_prompt_compression.py", line 9, in <module>

  File "/Users/PATH/Library/Python/3.9/lib/python/site-packages/llmlingua/prompt_compressor.py", line 224, in compress_prompt
    context = self.iterative_compress_prompt(
  File "/Users/elangert/Library/Python/3.9/lib/python/site-packages/llmlingua/prompt_compressor.py", line 776, in iterative_compress_prompt
    for delta_end, ratio in iterative_ratios[idx]:
IndexError: list index out of range

I have tested this with prompts up to 200 tokens (& as low as 10 tokens) & have tried setting iterative_size as low as single digits - all settings result in the same error above.
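
For reference, here is one of the variants I tried, with iterative_size set explicitly (iterative_size is an existing compress_prompt parameter; the value below is just one of the settings I tested):

# Same call as above, but with an explicit iterative_size;
# I tried values from the default down to single digits.
compressed_prompt = llm_lingua.compress_prompt(
    context=[],
    instruction=instruction,
    question=prompt,
    iterative_size=8,
)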

Please let me know any additional information you may need from me to run this down.

Thanks!

iofu728 commented 9 months ago

Hi @elanger4 ,

Thank you for identifying the issue. This bug has already been resolved in the most recent PR #15. You can update LLMLingua to the latest version and try again.

However, the current code does not compress the instruction and question parts. If you need that content compressed, please place it in context, as in the example below. I hope this is helpful to you!

from llmlingua import PromptCompressor

# Running with 'cpu'; on Mac without GPU
llm_lingua = PromptCompressor(device_map="cpu")

instruction = "You are a chatbot designed to answer questions about AI & ethics"
prompt = "Provide an in-depth exploration of the role of AI in enhancing the security and effectiveness of educational technology. Discuss the potential risks, such as data breaches and biased algorithms, the ethical considerations in using AI to monitor and assess student performance, and the best practices for safeguarding student data and privacy while leveraging AI to personalize and enhance the learning experience."

compressed_prompt = llm_lingua.compress_prompt(
    context=[prompt],
    instruction=instruction,
    question="",
    ratio=0.4,
)

print(compressed_prompt)

> {'compressed_prompt': 'You are a chatbot designed to answer questions about AI & ethics\n\nProvide andepth of the of incing the security and of educational technology.uss the potentialks, as data and, theations inI to monitor and the forarding student data and privacy while leveraging AI to personalize and enhance the learning experience.',
 'origin_tokens': 82,
 'compressed_tokens': 64,
 'ratio': '1.3x',
 'saving': ', Saving $0.0 in GPT-4.'}

Thank you for your support again!

elanger4 commented 9 months ago

Hi @iofu728,

Thank you for your quick response!

Confirming that shifting prompt into the context list works as intended now.

I pulled the documentation from here, and I still don't quite understand why the data is split across those parameters the way it is. I might suggest making the descriptions of those parameters in the documentation a bit more verbose.
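
For other readers who land here, this sketch reflects my current understanding of the parameter roles, based on the explanation above. The exact assembly format is my assumption from the printed output, not documented LLMLingua internals:

from llmlingua import PromptCompressor

llm_lingua = PromptCompressor(device_map="cpu")

# instruction and question are passed through uncompressed;
# only the strings in `context` are candidates for compression.
instruction = "System-level text, kept verbatim"
context = ["The long text you actually want compressed"]
question = "The final user question, also kept verbatim"

result = llm_lingua.compress_prompt(
    context=context,
    instruction=instruction,
    question=question,
    ratio=0.4,
)

# Judging from the output above, the pieces appear to be joined with
# blank lines, roughly:
#   instruction + "\n\n" + <compressed context> + "\n\n" + question
print(result["compressed_prompt"])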

Thank you again for your help & developing this library!

Closing issue.