intel-analytics / text-generation-webui

A Gradio web UI for running local LLMs on Intel GPUs (e.g., a local PC with an iGPU, or a discrete GPU such as Arc, Flex, or Max) using IPEX-LLM.

Add instruction template for Qwen and chatGLM3 #13

Closed: chtanch closed this pull request 6 months ago

chtanch commented 6 months ago

Usage:

References used for the template formats:

- Qwen: https://github.com/vllm-project/vllm/issues/1914
- ChatGLM3: https://github.com/THUDM/ChatGLM3/blob/main/PROMPT_en.md

Generated input with formatting tokens

Qwen:

```
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
What is the capital of France?<|im_end|>
<|im_start|>assistant
The capital of France is Paris.<|im_end|>
<|im_start|>user
What is the capital of Australia?<|im_end|>
<|im_start|>assistant
The capital of Australia is Canberra.<|im_end|>
<|im_start|>user
What is AI?<|im_end|>
<|im_start|>assistant
```
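For illustration, a minimal Python sketch (not code from this PR; the function name and message representation are made up here) that assembles this ChatML-style layout, which is handy for checking a template against the output above:

```python
def build_qwen_prompt(messages, system="You are a helpful assistant."):
    """Assemble a ChatML-style prompt of the kind Qwen expects."""
    parts = [f"<|im_start|>system\n{system}<|im_end|>"]
    for role, content in messages:
        parts.append(f"<|im_start|>{role}\n{content}<|im_end|>")
    # Leave an open assistant turn so the model generates the reply
    parts.append("<|im_start|>assistant")
    return "\n".join(parts)

print(build_qwen_prompt([
    ("user", "What is the capital of France?"),
    ("assistant", "The capital of France is Paris."),
    ("user", "What is AI?"),
]))
```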

chatGLM3:

```
<|system|>
You are a helpful assistant.
<|user|>
What is the capital of France?
<|assistant|>
The capital of France is Paris.
<|user|>
What is the capital of Australia?
<|assistant|>
The capital of Australia is Canberra.
<|user|>
What is AI?
<|assistant|>
```
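An equivalent sketch for the ChatGLM3 layout described in PROMPT_en.md, where each role tag sits on its own line (again purely illustrative, not the PR's code):

```python
def build_chatglm3_prompt(messages, system="You are a helpful assistant."):
    """Assemble a ChatGLM3-style prompt with role tags on their own lines."""
    parts = [f"<|system|>\n{system}"]
    for role, content in messages:
        parts.append(f"<|{role}|>\n{content}")
    # Trailing assistant tag marks where generation begins
    parts.append("<|assistant|>")
    return "\n".join(parts)
```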

Results

With this template, outputs from Qwen-7b seem better than with the previously hardcoded formatting function, and outputs from ChatGLM3-6b seem better than with its default chat template.

chtanch commented 6 months ago

Detection and auto-loading of the instruction template for Qwen-7b has been added. Doing the same for ChatGLM3 would be pointless, because `<ChatGLM3 model folder>/tokenizer_config.json` contains a chat template, and the web UI gives that template higher priority than custom instruction templates. However, the template in `tokenizer_config.json` appears to be incorrect and does not work well.
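One way to inspect the template that ships in the model folder is to render it directly with transformers (a sketch assuming the Hugging Face model id `THUDM/chatglm3-6b` and a transformers version that supports `apply_chat_template`), which makes it easy to compare against the custom template above:

```python
from transformers import AutoTokenizer

# ChatGLM3's tokenizer requires trust_remote_code=True
tok = AutoTokenizer.from_pretrained("THUDM/chatglm3-6b", trust_remote_code=True)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is AI?"},
]
# Render the chat template from tokenizer_config.json without tokenizing
print(tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))
```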