What happened?

Chat template formatting seems to be swapped for Mistral and Llama 2. Llama 2 supports the <<SYS>> token for system messages, while Mistral simply uses newlines.

Starting llama-server with "--chat-template llama2" returns the chat template:

Starting with "--chat-template mistral" returns the chat template:

Running a test prompt through in verbose mode confirms that it isn't just the initially reported example; the templates really are swapped.
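For anyone who wants to reproduce this outside the server, here is a minimal sketch (mine, not from the report) that renders the same conversation through both built-in template names, assuming the llama_chat_apply_template() signature as it exists in this build, which accepts a template name via its tmpl argument and needs no model when tmpl is non-null:

```cpp
// Repro sketch: render one conversation with the "llama2" and "mistral"
// built-in templates and print both results for comparison.
#include <cstdio>
#include <vector>
#include "llama.h"

static void render(const char * tmpl) {
    llama_chat_message chat[] = {
        {"system", "You are a helpful assistant."},
        {"user",   "Hello"},
    };
    std::vector<char> buf(1024);
    int32_t n = llama_chat_apply_template(nullptr, tmpl, chat, 2, /*add_ass=*/true,
                                          buf.data(), (int32_t) buf.size());
    if (n < 0) {
        fprintf(stderr, "template '%s' not supported\n", tmpl);
        return;
    }
    // a negative-free return larger than the buffer means we need more space
    if (n > (int32_t) buf.size()) {
        buf.resize(n);
        n = llama_chat_apply_template(nullptr, tmpl, chat, 2, /*add_ass=*/true,
                                      buf.data(), (int32_t) buf.size());
    }
    printf("--- %s ---\n%.*s\n", tmpl, n, buf.data());
}

int main() {
    render("llama2");
    render("mistral");
    return 0;
}
```

If the swap described above is present, the "mistral" rendering should be the one that contains the <<SYS>> markers around the system message.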
Name and Version

./llama-server --version
version: 3785 (64c6af31)
built with cc (Ubuntu 13.2.0-23ubuntu4) 13.2.0 for x86_64-linux-gnu

What operating system are you seeing the problem on?

Linux

Relevant log output
I can confirm that this is indeed a bug. Previously, the chat templates for Llama 2 and Mistral were not very well documented, so there may have been some confusion when they were implemented in llama.cpp.

It's still pretty darn ambiguous. The chat templates included with Mistral-7B-v0.1, Mistral-Small, and Mistral-Large all put whitespace between the prompt and the [INST] tokens, but Mistral Nemo does not. ~~(Although I suspect that to be a mistake.)~~

Looks like there's some nuance related to tokenization: https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407/discussions/56
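To check whether that whitespace actually survives tokenization for a given model, a quick sketch (assuming the model-based llama_tokenize() and llama_load_model_from_file() signatures from this build; "model.gguf" is a placeholder for any Mistral-family GGUF):

```cpp
// Tokenize the same text with and without the spaces around [INST] and
// print the token ids, so the two variants can be diffed directly.
#include <cstdio>
#include <cstring>
#include <vector>
#include "llama.h"

static void dump_tokens(const llama_model * model, const char * text) {
    std::vector<llama_token> toks(128);
    // parse_special = true so [INST] can match a control token where the
    // tokenizer defines one; add_special = false keeps BOS out of the diff.
    int32_t n = llama_tokenize(model, text, (int32_t) strlen(text),
                               toks.data(), (int32_t) toks.size(),
                               /*add_special=*/false, /*parse_special=*/true);
    if (n < 0) {
        fprintf(stderr, "token buffer too small\n");
        return;
    }
    printf("%-22s ->", text);
    for (int32_t i = 0; i < n; i++) printf(" %d", toks[i]);
    printf("\n");
}

int main() {
    llama_backend_init();
    llama_model * model = llama_load_model_from_file("model.gguf",
                                                     llama_model_default_params());
    if (model == nullptr) return 1;
    dump_tokens(model, "[INST] Hello [/INST]"); // with whitespace
    dump_tokens(model, "[INST]Hello[/INST]");   // without
    llama_free_model(model);
    llama_backend_free();
    return 0;
}
```

Whether the two lines produce identical ids depends on the model's tokenizer, which is exactly the nuance discussed in the linked thread.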
They also put the system prompt before the last user message, but suggest that system prompts are only softly supported and that the placement may not be important.
https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407/discussions/47
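If I'm reading that discussion right, a two-turn conversation with system prompt S, user turns U1/U2, and assistant answer A1 would render roughly like this (illustration only; exact whitespace not authoritative):

```
<s>[INST] U1 [/INST] A1</s>[INST] S

U2 [/INST]
```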
This issue was closed because it has been inactive for 14 days since being marked as stale.

StrangeBytesDev closed this 5 days ago.