microsoft / LLMLingua

To speed up LLM inference and enhance the LLM's perception of key information, LLMLingua compresses the prompt and KV-Cache, achieving up to 20x compression with minimal performance loss.
https://llmlingua.com/
MIT License
4.27k stars · 228 forks

Index error for small token amounts #84

Closed oz03-hub closed 5 months ago

oz03-hub commented 5 months ago

Hello again. I noticed strange behavior while developing with llmlingua: for small prompts such as "hello" or "who", the compress function throws an "index 0 out of range" error.

```python
"""Token compression using llmlingua with the GPT-2 small LLM."""
from llmlingua import PromptCompressor

# optimizer_model, device_map, and user_prompt are defined elsewhere in my code
llm_lingua = PromptCompressor(model_name=optimizer_model, device_map=device_map)

try:
    print(user_prompt)
    print("Compressing")
    compressed_prompt = llm_lingua.compress_prompt(
        context=[user_prompt],
        ratio=0.2,
    )
except IndexError as e:
    print(f"Compression failed: {e}")
```

Here is the initialization that triggers the error. Thanks, and I'm looking forward to updates.
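Until a fix lands, one possible workaround is to guard the compression call and skip it for very short prompts. The sketch below is not part of the LLMLingua API; `safe_compress` and `min_tokens` are hypothetical names, and the whitespace-split token count is only a rough stand-in for the model's real tokenizer.

```python
def safe_compress(compress_fn, prompt, min_tokens=10, **kwargs):
    """Call compress_fn only when the prompt is long enough to compress.

    For very short inputs (which trigger the index-out-of-range error
    reported above), return the prompt unchanged instead of compressing.
    The token count here is a rough whitespace split, not the model's
    real tokenizer, so min_tokens should be chosen conservatively.
    """
    if len(prompt.split()) < min_tokens:
        # Too short to compress safely; pass through unchanged.
        return {"compressed_prompt": prompt}
    return compress_fn(context=[prompt], **kwargs)
```

With this guard, `safe_compress(llm_lingua.compress_prompt, "hello", ratio=0.2)` would return `"hello"` untouched rather than raising, while longer prompts are still compressed normally.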

oz03-hub commented 5 months ago

The same behavior also occurs in the Hugging Face demo.

iofu728 commented 5 months ago

Thank you for pointing this out. We will fix the issue ASAP.

iofu728 commented 5 months ago

Fixed in #87