bdambrosio opened 5 months ago
I think the issue is a difference in how tabby and HF's tokenizer handle the chat template from tokenizer_config. In particular, the llama-3 template doesn't initialize `add_generation_prompt`, and tabby seems to default it to True. But my prompts, at least, run much better on llama-3 variants without that final empty assistant message.
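To illustrate the difference, here is a minimal sketch using a simplified stand-in for the Llama-3 chat template (the template string below is illustrative, not the exact one shipped in tokenizer_config.json). It shows how the `add_generation_prompt` flag controls whether an empty assistant header is appended to the prompt:

```python
from jinja2 import Template

# Simplified stand-in for the Llama-3 chat template (illustrative only).
LLAMA3_STYLE_TEMPLATE = (
    "{% for message in messages %}"
    "<|start_header_id|>{{ message['role'] }}<|end_header_id|>\n\n"
    "{{ message['content'] }}<|eot_id|>"
    "{% endfor %}"
    "{% if add_generation_prompt %}"
    "<|start_header_id|>assistant<|end_header_id|>\n\n"
    "{% endif %}"
)

def render(messages, add_generation_prompt):
    # When the flag is omitted entirely, Jinja treats the undefined variable
    # as falsy, which is why HF's apply_chat_template effectively behaves as
    # add_generation_prompt=False unless told otherwise.
    return Template(LLAMA3_STYLE_TEMPLATE).render(
        messages=messages, add_generation_prompt=add_generation_prompt
    )

messages = [{"role": "user", "content": "Hello"}]
print(render(messages, False))  # prompt ends after the user's <|eot_id|>
print(render(messages, True))   # prompt ends with an empty assistant header
```

With the flag set to True, the prompt ends with an open assistant turn, which is the "final empty Assistant message" described above.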
Well, it works, but the generated text is poor quality.