Closed BenjaminGantenbein closed 6 months ago
You should be able to turn off skip_special_tokens in the UI and set <|eot_id|> as a stop condition. Alternatively, change eos_token_id from 128001 to 128009 in config.json. If neither of those works, then there's something else going on.
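For reference, a minimal sketch of that config.json change (the model directory path is a placeholder; adjust it to your quantized model folder). In Llama-3, 128001 is <|end_of_text|> and 128009 is <|eot_id|>:

```python
import json
from pathlib import Path

# Hypothetical path to the quantized model directory; adjust as needed.
config_path = Path("/models/llama3-70b-finetune-exl2") / "config.json"

config = json.loads(config_path.read_text())
print("old eos_token_id:", config.get("eos_token_id"))  # typically 128001 (<|end_of_text|>)

# Point EOS at <|eot_id|>, the token the Llama-3 instruct template emits
# at the end of each assistant turn.
config["eos_token_id"] = 128009

config_path.write_text(json.dumps(config, indent=2))
print("new eos_token_id:", config["eos_token_id"])
```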
Do you have any idea what format that model was finetuned for? Is it finetuned with the Llama3-instruct template? Or finetuned from Llama3-instruct to some other format using extra tokens that aren't merged properly? It's hard to speculate as to why it's not working without those details.
Thanks for the quick reply. I was using the ShareGPT format, but didn't add the EOS token in the axolotl configuration file. I guess this is the issue.
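A quick way to check whether the exported tokenizer ended up with the intended EOS token (just a sketch; the model path is a placeholder):

```python
from transformers import AutoTokenizer

# Hypothetical path to the fine-tuned (unquantized) model or tokenizer directory.
tok = AutoTokenizer.from_pretrained("/models/llama3-70b-finetune")

print("eos_token:", tok.eos_token, "->", tok.eos_token_id)
print("<|eot_id|> id:", tok.convert_tokens_to_ids("<|eot_id|>"))
# If eos_token is still <|end_of_text|> (128001) but the chat template ends turns
# with <|eot_id|> (128009), generation will run past the end of each reply.
```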
Hi!
First of all thanks for the nice repo!
I have already tried many of the solutions proposed here:
https://github.com/oobabooga/text-generation-webui/issues/5885
but I always get either answers that don't stop, or answers decorated with "ASSISTANT: Hi, I am the assistant", "USER: Hi, I am the assistant", or "> Hi, I am the assistant".
I tried changing the eos_token in the tokenizer_config. I also tried various stop token IDs in the exllamav2 chat template, and setting the encoding options to false, false, false. I am using a llama3-70b fine-tuned with axolotl and then quantized with exllamav2. Has anyone found a setup (tokenizer_config etc.) that worked for them?
Thanks
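For what it's worth, this is the kind of stop-condition setup I was trying to get working through exllamav2's streaming generator (just a sketch with a placeholder model path, assuming a recent exllamav2 version; the exact loading calls may differ between releases):

```python
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2StreamingGenerator

# Placeholder path to the exl2-quantized model directory.
config = ExLlamaV2Config()
config.model_dir = "/models/llama3-70b-finetune-exl2"
config.prepare()

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)
model.load_autosplit(cache)
tokenizer = ExLlamaV2Tokenizer(config)

generator = ExLlamaV2StreamingGenerator(model, cache, tokenizer)

# Stop on <|eot_id|> (end of assistant turn) as well as the tokenizer's own EOS token.
generator.set_stop_conditions([tokenizer.single_id("<|eot_id|>"), tokenizer.eos_token_id])
```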