PromtEngineer / localGPT

Chat with your documents on your local device using GPT models. No data leaves your device, and it is 100% private.
Apache License 2.0
20.02k stars 2.23k forks

Input length greater than the max_length #307

Open SquidCubic opened 1 year ago

SquidCubic commented 1 year ago

I'm using the GPU with the model below:

model_id = "TheBloke/Llama-2-13B-chat-GPTQ"
model_basename = "gptq_model-4bit-128g.safetensors"

I use 10 PDF files of my own (100k-200k each), and the model starts correctly.

However, when I enter a query, I always get the error below, no matter what question I ask:

Input length of input_ids is 3823, but max_length is set to 2048. This can lead to unexpected behavior. You should consider increasing max_new_tokens.

Does anyone know why this happens? Thanks.
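The error means the prompt that localGPT builds (the retrieved document chunks plus your question) is already 3823 tokens, which is larger than the model's 2048-token context window, so nothing the model generates can fix it; the prompt itself has to shrink. A minimal sketch of one common workaround, trimming retrieved chunks to a token budget before building the prompt. This is an illustration, not localGPT's actual code: `trim_chunks` is a hypothetical helper, and whitespace word count stands in for a real tokenizer (Llama's tokenizer would give the exact count).

```python
# Hedged sketch: keep only as many retrieved chunks as fit the context
# window, leaving room for the question and the generated answer.
# Word count is a stand-in for real token counting.

def trim_chunks(chunks, question, max_tokens=2048, reserve_for_answer=512):
    """Keep whole chunks, in order, until the token budget is spent."""
    budget = max_tokens - reserve_for_answer - len(question.split())
    kept = []
    for chunk in chunks:
        cost = len(chunk.split())
        if cost > budget:
            break  # this chunk would overflow the window; stop here
        kept.append(chunk)
        budget -= cost
    return kept

chunks = ["alpha " * 300, "beta " * 300, "gamma " * 300]  # ~300 "tokens" each
kept = trim_chunks(chunks, "What is alpha?", max_tokens=1024, reserve_for_answer=200)
print(len(kept))  # 2 -- the third chunk no longer fits the remaining budget
```

Reducing the retriever's `k` (number of chunks returned) or the chunk size at ingestion has the same effect with less code.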

andypotato commented 1 year ago

Same issue here using the 7B Llama2 model, any ideas?

SquidCubic commented 1 year ago

I think the error occurs because my PDF documents are not in English, so tokenization produces many more tokens. I'll try switching to a multilingual model. Please check this video for details: https://www.youtube.com/watch?v=ThKWQcyQXF8
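The tokenization point is plausible: BPE/SentencePiece vocabularies trained mostly on English text encode non-Latin scripts much less compactly, in the worst case falling back to one token per UTF-8 byte, so the same amount of text consumes more of the context window. A rough illustration of that worst case (this is not Llama's actual tokenizer, just the byte-fallback upper bound):

```python
# Rough illustration: in the worst case a byte-fallback tokenizer emits
# one token per UTF-8 byte, so non-ASCII scripts cost several tokens
# per character while ASCII costs one.

def byte_fallback_upper_bound(text):
    """Worst-case token count if every character falls back to raw bytes."""
    return len(text.encode("utf-8"))

english = "Hello world"
chinese = "你好世界"  # 4 characters, 3 UTF-8 bytes each

print(byte_fallback_upper_bound(english))  # 11 (1 byte per ASCII char)
print(byte_fallback_upper_bound(chinese))  # 12 (3 bytes per CJK char)
```

In practice a multilingual tokenizer sits well below this upper bound, but the gap between English and non-English token counts is still large enough to overflow a 2048-token window sooner.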

rlleshi commented 1 year ago

@SquidCubic llama2 is actually multilingual. Did you perhaps find a better fit?