Open nmandic78 opened 7 months ago
Tnx! I found it here: https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct/blob/c9231f629c54de150fe4cca99a98034f32fb589e/tokenizer_config.json#L2053 It is new. My use case is calling the server from llama.cpp with my own Python code, but unfortunately the llama.cpp server executable currently doesn't support custom prompt templates, so I will either find a workaround or, since llama3 is hot, ggerganov will add the template before I do.
@nmandic78 Can you paste the prompt format here please, so no one has to jump back and forth? Thanks!
Of course! As defined in tokenizer_config.json:

```jinja
{% set loop_messages = messages %}{% for message in loop_messages %}{% set content = '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n'+ message['content'] | trim + '<|eot_id|>' %}{% if loop.index0 == 0 %}{% set content = bos_token + content %}{% endif %}{{ content }}{% endfor %}{{ '<|start_header_id|>assistant<|end_header_id|>\n\n' }}
```
And applied with the llama.cpp main executable, for example:

```bash
./main -m ~/models/Meta-Llama-3-8B-Instruct.Q8_0.gguf --color -n -2 -e -s 0 -p '<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\nYou are a helpful assistant.<|eot_id|>\n<|start_header_id|>user<|end_header_id|>\n\nHi!<|eot_id|>\n<|start_header_id|>assistant<|end_header_id|>\n\n' -ngl 99 --mirostat 2 -c 8192 -r '<|eot_id|>' --in-prefix '\n<|start_header_id|>user<|end_header_id|>\n\n' --in-suffix '<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n' -i
```
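If you are building the prompt from Python instead, the same template can be rendered with transformers' apply_chat_template. A minimal sketch, assuming you have access to the gated meta-llama/Meta-Llama-3-8B-Instruct repo; the messages below are just placeholders:

```python
# Minimal sketch: render the Llama 3 chat template from Python with transformers.
# Assumes access to the gated meta-llama/Meta-Llama-3-8B-Instruct tokenizer.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hi!"},
]

# add_generation_prompt=True appends the trailing
# '<|start_header_id|>assistant<|end_header_id|>\n\n' so the model starts answering.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
```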
Unfortunately, my use case is the server executable (from llama.cpp), and custom prompt templates are not supported there (only a hardcoded list of them). I could try formatting it in Python, but I think I will wait a day or two for it to be added.
Still, hope this will help someone.
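In case it helps with the same workaround: a rough sketch of formatting the prompt by hand in Python and posting it to the llama.cpp server's /completion endpoint. The host/port, sampling settings, and the build_llama3_prompt helper are just placeholders for illustration:

```python
# Rough sketch of the workaround: build the Llama 3 prompt by hand and call the
# llama.cpp server's /completion endpoint directly.
import requests

def build_llama3_prompt(system_prompt: str, user_prompt: str) -> str:
    return (
        "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
        f"{system_prompt}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_prompt}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

resp = requests.post(
    "http://localhost:8080/completion",  # assumes the default server address
    json={
        "prompt": build_llama3_prompt("You are a helpful assistant.", "Hi!"),
        "n_predict": 256,
        "stop": ["<|eot_id|>"],  # stop on the end-of-turn token so headers don't leak into output
    },
)
print(resp.json()["content"])
```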
<|start_header_id|>system<|end_header_id|> {System prompt}<|eot_id|>\n\n<|start_header_id|>user<|end_header_id|> {User Prompt}<|eot_id|>\n\n<|start_header_id|>assistant<|end_header_id|> {Model response}<|eot_id|>
This should work with Llama 3:
```
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

{{ system_prompt }}<|eot_id|><|start_header_id|>user<|end_header_id|>

{{ user_message_1 }}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

{{ model_answer_1 }}<|eot_id|><|start_header_id|>user<|end_header_id|>

{{ user_message_2 }}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
```
For more details, check out Llama recipes - https://github.com/meta-llama/llama-recipes
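If you want to build that multi-turn string yourself without pulling in transformers, a small helper along these lines should work. This is only a sketch; the format_llama3 name and the messages layout are my own choices:

```python
# Sketch of a hand-rolled formatter for the multi-turn Llama 3 layout shown above.
# Messages follow the usual [{"role": ..., "content": ...}] convention.
def format_llama3(messages) -> str:
    prompt = "<|begin_of_text|>"
    for m in messages:
        prompt += (
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
            f"{m['content'].strip()}<|eot_id|>"
        )
    # Trailing assistant header so generation continues as the model's answer.
    prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt

print(format_llama3([
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "Hi!"},
]))
```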
<|begin_of_text|><|start_header_id|>system<|end_header_id|> You are a helpful assistant<|eot_id|><|start_header_id|>user<|end_header_id|> {prompt}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
Replace the {prompt} placeholder with your prompt text, and it will work.
Thanks for this! Just added support for it in litellm - https://github.com/BerriAI/litellm/commit/df7db2b870d2e1201888bb625c446e4473759ffb
You can now make calls to Llama 3 models on vLLM etc. in the OpenAI format.
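For anyone who wants to try it, a call through litellm against an OpenAI-compatible endpoint (e.g. a vLLM server) looks roughly like this. A sketch only: the model name, api_base, and the "openai/" provider prefix are assumptions, so check the litellm docs for your exact setup:

```python
# Rough sketch of calling a Llama 3 model served behind an OpenAI-compatible
# endpoint (e.g. vLLM) through litellm. Model name and api_base are placeholders.
from litellm import completion

response = completion(
    model="openai/meta-llama/Meta-Llama-3-8B-Instruct",  # "openai/" routes to an OpenAI-compatible server
    api_base="http://localhost:8000/v1",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hi!"},
    ],
)
print(response.choices[0].message.content)
```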
What prompt template does Llama 3 use? I keep getting "assistant" at the end of generation when using the llama2 or chatml template. I'm using the instruct variant.