guinmoon / LLMFarm

Run Llama and other large language models offline on iOS and macOS using the GGML library.
https://llmfarm.site
MIT License

Support ChatML template? #71

Closed (actow closed this issue 2 weeks ago)

actow commented 3 weeks ago

The Qwen2 models use ChatML.

See https://huggingface.co/Qwen/Qwen2-7B-Instruct/blob/main/tokenizer_config.json

actow commented 3 weeks ago

More References:

https://github.com/ollama/ollama/blob/main/templates/chatml.gotmpl

guinmoon commented 3 weeks ago

Hi. You can manually create such a template.

BOS = false, EOS = false

[system](<|im_start|>system
You are a helpful assistant.)
<eos><|im_start|>user
{prompt}<eos><|im_start|>assistant

stop words = <|im_start|>,<eos>
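For readers unfamiliar with stop words, here is a minimal Python sketch of what such a list typically does: generated text is cut at the first stop word that appears. The helper name `truncate_at_stop_words` is hypothetical, and whether LLMFarm applies its stop words exactly this way is an assumption, not something confirmed in this thread.

```python
def truncate_at_stop_words(text: str, stop_words: list[str]) -> str:
    """Cut `text` at the earliest occurrence of any stop word.

    Hypothetical helper for illustration only; not LLMFarm's actual code.
    """
    cut = len(text)
    for word in stop_words:
        idx = text.find(word)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


# Stop words copied from the comment above.
STOP_WORDS = ["<|im_start|>", "<eos>"]

# Made-up raw model output that runs past the end of the assistant turn.
raw = "Paris is the capital of France.<eos><|im_start|>user\n"
reply = truncate_at_stop_words(raw, STOP_WORDS)
print(reply)
```

Without the stop words, the model could keep generating and start writing the next user turn itself.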

yangtuo250 commented 2 weeks ago

My template:

BOS = false, EOS = false, Special = true

Format:


<|im_start|>system
You are a helpful assistant.
<eos>
<|im_start|>user
{prompt}<eos>
<|im_start|>assistant

Skip tokens: <|im_start|>,<eos>

Metal = true

On an iPhone 12 Pro, Qwen2-1.5B-Instruct-Q8_0 works fine for conversation and code generation.

My config json: Qwen2.json
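
To make the format above concrete, here is a rough Python sketch of how the `{prompt}` placeholder might be filled in before the text is sent to the model. The `render` function and the sample question are hypothetical, and the `<eos>` marker is copied as written in this thread. The Qwen2 tokenizer config linked earlier names `<|im_end|>` as its end-of-turn token, so check which marker your setup expects.

```python
# Template text copied from the format above; "<eos>" is kept as written.
TEMPLATE = (
    "<|im_start|>system\n"
    "You are a helpful assistant.\n"
    "<eos>\n"
    "<|im_start|>user\n"
    "{prompt}<eos>\n"
    "<|im_start|>assistant\n"
)


def render(template: str, prompt: str) -> str:
    """Substitute the user's message for the {prompt} placeholder.

    Hypothetical helper; uses str.replace instead of str.format so that
    braces inside the user's message are passed through untouched.
    """
    return template.replace("{prompt}", prompt)


rendered = render(TEMPLATE, "Write a haiku about llamas.")
print(rendered)
```

The rendered text ends with an open assistant turn, which is what cues the model to write its reply next.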

actow commented 2 weeks ago

Thanks a lot. It would be even better if this became a built-in template. Given that Qwen2 is gaining popularity and has 0.5B and 1.5B models, I'd imagine this would be useful to many people.

actow commented 2 weeks ago

Also, is there a way to import templates?

yangtuo250 commented 2 weeks ago

> Also, is there a way to import templates?

On iPhone, just put the JSON file in the Files app -> On My iPhone -> LLM Farm -> model_setting_templates.

actow commented 2 weeks ago

That worked. Thanks a lot.