Closed by majestichou 8 months ago
Yup, same issue I had here: https://github.com/oobabooga/text-generation-webui/issues/5374
The problem is that the prompt gets truncated to make room for max_new_tokens, so messages are removed until the prompt fits (or nothing is left). I dislike this behaviour and find it extremely counterintuitive, but it is apparently expected/acceptable.
I offered a solution here: https://github.com/oobabooga/text-generation-webui/pull/5382
If you're willing to merge that and build your own install, you can try it; otherwise, just reduce max_new_tokens as you have been doing.
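In case it helps to see the mechanism, here is a minimal sketch of that truncation behaviour (this is not the webui's actual code; `build_prompt` and the toy token counter are made-up names used only for illustration):

```python
# Rough sketch of the truncation described above. NOT the webui's real
# implementation; the names here are hypothetical and only illustrate the idea.

def build_prompt(messages, context_length, max_new_tokens, count_tokens):
    # The prompt may only use whatever is left after reserving room for the reply.
    prompt_budget = context_length - max_new_tokens

    # Drop the oldest messages until the remaining history fits the budget.
    kept = list(messages)
    while kept and count_tokens(kept) > prompt_budget:
        kept.pop(0)

    return "\n".join(kept)


# Toy demo: pretend one word == one token.
count = lambda msgs: sum(len(m.split()) for m in msgs)
history = [
    "You are a helpful assistant.",
    "User: hello",
    "Assistant: hi, how can I help?",
    "User: please write a bubble sort in Python",
]

# context_length == max_new_tokens leaves a budget of 0, so every message is
# dropped and the model sees an empty prompt, hence the identical,
# nonsensical replies.
print(repr(build_prompt(history, 4096, 4096, count)))  # -> ''

# A smaller max_new_tokens leaves room for the chat, and generation is normal.
print(repr(build_prompt(history, 4096, 2000, count)))  # -> the full history
```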
Thanks a lot! Great job!
Describe the bug
I used the webui to load the meta-llama/Llama-2-70b-chat-hf model and set the Instruct template to llama-v2. Then I set the preset to Divine Intellect in the Generation tab and changed "max_new_tokens" to 4096. I chose the "chat-instruct" mode. Then I went to the chat tab and asked codellama/CodeLlama-70b-Instruct-hf some questions. The answers were ridiculous, and no matter what question I asked, the AI answered exactly the same thing. My conversation with the AI is as follows:
When I set "max_new_tokens" to 2000, everything was OK.
I don't know if the problem is caused by meta-llama/Llama-2-70b-chat-hf or by text-generation-webui. Does meta-llama/Llama-2-70b-chat-hf limit max_new_tokens? Or is there a bug in text-generation-webui?
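For reference, the two settings leave very different amounts of room for the conversation itself (this assumes the usual 4096-token Llama-2 context window, which I have not verified for this checkpoint):

```python
# Back-of-the-envelope check; assumes a 4096-token context window.
context_length = 4096
for max_new_tokens in (4096, 2000):
    prompt_budget = context_length - max_new_tokens
    print(f"max_new_tokens={max_new_tokens}: {prompt_budget} tokens left for the prompt")
# max_new_tokens=4096: 0 tokens left for the prompt
# max_new_tokens=2000: 2096 tokens left for the prompt
```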
Is there an existing issue for this?
Reproduction
Screenshot
nothing
Logs
System Info