microsoft / LLMLingua

To speed up LLM inference and enhance the LLM's perception of key information, LLMLingua compresses the prompt and KV-Cache, achieving up to 20x compression with minimal performance loss.
https://llmlingua.com/
MIT License

[Question]: Token indices sequence length is longer than the specified maximum sequence length for this model (614 > 512). Running this sequence through the model will result in indexing errors #165

Open

lifengyu2005 commented 3 weeks ago

Describe the issue

I use the following configuration; why is it throwing this error? I see many 512 settings in the llmlingua installation path. Do I need to retrain the model, or is this an issue with the llmlingua version?

```python
self.model_compress = PromptCompressor(
    model_name="/xxx/llmlingua/llmlingua-2-xlm-roberta-large-meetingbank",
    use_llmlingua2=True,  # Whether to use llmlingua-2
    llmlingua2_config={
        "max_batch_size": 100,
        "max_force_token": 4096,
    },
)
```

llmlingua ver 0.2.2

iofu728 commented 2 weeks ago

Hi @lifengyu2005, thanks for your support. These logs appear to be warnings. Did your program crash because of these warnings? Please provide more details to help us identify the issue.
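For context: this message is typically a warning emitted by the Hugging Face tokenizer when an encoded sequence exceeds the encoder's 512-token limit, not a crash by itself. Libraries that use fixed-length encoders commonly handle long inputs by splitting them into windows of at most 512 tokens. The toy sketch below illustrates that windowing idea only; it is not LLMLingua's actual implementation, and the function name is hypothetical.

```python
def chunk_tokens(token_ids, max_len=512):
    """Split a token-id sequence into windows no longer than max_len.

    Illustrative only: shows how a 614-token input (as in the warning
    "614 > 512") can be processed without indexing errors by feeding
    the model one window at a time.
    """
    return [token_ids[i:i + max_len] for i in range(0, len(token_ids), max_len)]


# Example: a 614-element sequence splits into windows of 512 and 102 tokens.
chunks = chunk_tokens(list(range(614)))
print([len(c) for c in chunks])  # [512, 102]
```

Each window fits the model's maximum sequence length, so encoding it raises no warning; the per-window results can then be concatenated.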