We have the same issue. It seems to be related to exceeding the maximum context length.
Could you provide steps to reproduce this issue? That would be helpful for debugging.
This error occurs when I use LangChain + VectorStore to supply a large amount of retrieved context for LLM generation. So it seems the LLM context is too long, but the error message is not informative and just surfaces an internal Python error.
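For anyone hitting this through a retrieval pipeline, a quick sanity check is to count prompt tokens before calling the model. Below is a minimal sketch (not part of FastChat or LangChain; the model path, context limit, and generation budget are placeholders) using a Hugging Face tokenizer:

```python
# Sketch: verify that prompt tokens plus the generation budget fit in the
# context window before sending a retrieval-augmented prompt to the model.
from transformers import AutoTokenizer

MODEL_PATH = "meta-llama/Llama-2-7b-chat-hf"  # placeholder checkpoint
CONTEXT_LIMIT = 4096                          # Llama 2 context window
MAX_NEW_TOKENS = 512                          # assumed generation budget

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)

def fits_in_context(prompt: str) -> bool:
    """True if the prompt leaves room for MAX_NEW_TOKENS new tokens."""
    return len(tokenizer.encode(prompt)) + MAX_NEW_TOKENS <= CONTEXT_LIMIT
```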
@merrymercy
Same issue. I encounter this with llama2-chat-7b. The input length + max_new_tokens is around 2500 tokens, less than the 4096 limit. I was also able to run it successfully with Vicuna 1.5-7b; both are based on Llama 2.
I may have found the cause of the error: max_position_embeddings in config.json. The config.json downloaded from the LLAMA2 repository has max_position_embeddings = 2048 (which is for llama1). When I change it to 4096, it works. @npuichigo
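If anyone wants to apply this fix without editing the file by hand, a small script like the following works (a sketch only; adjust the placeholder path to your local checkpoint):

```python
# Sketch: bump max_position_embeddings in a downloaded checkpoint's
# config.json, as described above. The path is a placeholder.
import json

config_path = "path/to/llama-2-7b-chat/config.json"  # adjust to your checkpoint

with open(config_path) as f:
    config = json.load(f)

print("before:", config.get("max_position_embeddings"))
config["max_position_embeddings"] = 4096  # Llama 2 supports 4096 positions

with open(config_path, "w") as f:
    json.dump(config, f, indent=2)
```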
@zchuz Thanks for the info. I also think it's caused by the model's max length restriction, but FastChat should return a clearer error response; what it reports instead is actually a Python bug: UnboundLocalError: local variable 'stopped' referenced before assignment
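For reference, this kind of UnboundLocalError typically comes from a flag that is only assigned inside the generation loop. A minimal reproduction sketch (an assumption about the shape of the code, not the actual FastChat source): if the prompt already fills the context window, the computed generation budget is non-positive, the loop body never runs, and the name is read before it is ever bound.

```python
# Sketch (not FastChat's actual code) of how the error can arise.
def stream_generate(prompt_len: int, context_len: int = 4096):
    max_new_tokens = context_len - prompt_len  # <= 0 for an over-long prompt
    for _ in range(max_new_tokens):
        stopped = False  # 'stopped' is bound only inside the loop
    if not stopped:      # UnboundLocalError when the loop never ran
        pass

stream_generate(prompt_len=5000)  # reproduces the error

# Fix: initialize `stopped = False` before the loop, and ideally reject
# prompts that exceed the context window with an explicit error message.
```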
@merrymercy
I encountered the same issue, and I am using fastchat-t5-3b, but I cannot find where to change max_position_embeddings in the config.
I think
UnboundLocalError: local variable 'stopped' referenced before assignment
is a bug in the code.