Open sha367 opened 1 month ago
This is a known issue caused by using the wrong template with a model. Templates are meant to be stored in the gguf, but often they aren't, which leads to the app falling back to chatml.
Try setting the template parameter to "gemma" or "llama2".
The available template strings are "chatml", "llama2", "phi3", "zephyr", "monarch", "gemma", "gemma2", "orion", "vicuna", "vicuna-orca", "deepseek", "command-r", "llama3", "minicpm" and "deepseek2".
These are the templates llama.cpp handles out of the box, and if a template isn't set in the gguf file there's currently no way of detecting which one should be used.
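To illustrate the fallback behaviour described above, here is a minimal sketch (not Maid's or llama.cpp's actual code; the function name `resolve_template` and the `user_override` parameter are hypothetical, though `tokenizer.chat_template` is the real GGUF metadata key):

```python
# Known template strings that llama.cpp handles out of the box.
KNOWN_TEMPLATES = {
    "chatml", "llama2", "phi3", "zephyr", "monarch", "gemma", "gemma2",
    "orion", "vicuna", "vicuna-orca", "deepseek", "command-r", "llama3",
    "minicpm", "deepseek2",
}

def resolve_template(gguf_metadata: dict, user_override: str = None) -> str:
    """Hypothetical sketch: pick a chat template name."""
    # 1. An explicit user setting (e.g. Maid's template parameter) wins.
    if user_override in KNOWN_TEMPLATES:
        return user_override
    # 2. Otherwise use the template embedded in the GGUF, if any.
    embedded = gguf_metadata.get("tokenizer.chat_template")
    if embedded:
        return embedded
    # 3. No template information at all: fall back to chatml,
    #    which is often the wrong choice for non-chatml models.
    return "chatml"

print(resolve_template({}, "gemma"))
print(resolve_template({}))
```

Setting the template parameter manually corresponds to step 1, which is why it works even when the gguf carries no template metadata.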
Hi @danemadsen, thanks for the previous comment! I tried setting the template parameter to gemma or gemma2, but the model still doesn't work.
After that I also tried a bunch of other models, and I got wrong answers or Maid quit unexpectedly.
Hello. I have the same issues.
We are testing the macOS version of Maid with gguf models from Hugging Face (https://huggingface.co/bartowski and https://huggingface.co/QuantFactory), and they start talking to themselves. With AnythingLLM these models work fine. I assume there is some issue with end-of-response token initialization.