Open sha367 opened 1 month ago
This is a known issue caused by using the wrong template with a model. Templates are meant to be stored in the gguf, but often they aren't, which leads to the app falling back to chatml.
Try setting the template parameter to "gemma" or "llama2".
The available template strings are "chatml", "llama2", "phi3", "zephyr", "monarch", "gemma", "gemma2", "orion", "vicuna", "vicuna-orca", "deepseek", "command-r", "llama3", "minicpm" and "deepseek2".
These are the templates llama.cpp handles out of the box, and if a template isn't set in the gguf file there's currently no way of detecting which one should be used.
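To illustrate the fallback behaviour described above, here is a minimal sketch (not Maid's or llama.cpp's actual code; the function name `resolve_template` and the `user_override` parameter are hypothetical, though `tokenizer.chat_template` is the real GGUF metadata key):

```python
# Known template strings that llama.cpp handles out of the box.
KNOWN_TEMPLATES = {
    "chatml", "llama2", "phi3", "zephyr", "monarch", "gemma", "gemma2",
    "orion", "vicuna", "vicuna-orca", "deepseek", "command-r", "llama3",
    "minicpm", "deepseek2",
}

def resolve_template(gguf_metadata: dict, user_override: str = None) -> str:
    """Hypothetical sketch: pick a chat template name."""
    # 1. An explicit user setting (e.g. Maid's template parameter) wins.
    if user_override in KNOWN_TEMPLATES:
        return user_override
    # 2. Otherwise use the template embedded in the GGUF, if any.
    embedded = gguf_metadata.get("tokenizer.chat_template")
    if embedded:
        return embedded
    # 3. No template information at all: fall back to chatml,
    #    which is often the wrong choice for non-chatml models.
    return "chatml"

print(resolve_template({}, "gemma"))
print(resolve_template({}))
```

Setting the template parameter manually corresponds to step 1, which is why it works even when the gguf carries no template metadata.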
Hi @danemadsen, thanks for the previous comment! I tried setting the template parameter to gemma or gemma2, but the model still doesn't work.
After that I also tried a bunch of other models, and I got wrong answers or Maid quit unexpectedly.
Hello. I have the same issues.
We are testing the macOS version of Maid with gguf models from Hugging Face (https://huggingface.co/bartowski and https://huggingface.co/QuantFactory), and they start talking to themselves. With AnythingLLM these models work fine. I assume there is some issue with end-of-response token initialization.