It was stated in the README that
For both benchmarks, we have added support for the Zephyr chat template (which is the default produced by our scripts), so you can evaluate models produced by our scripts as follows:
The document then says:
Make sure the word zephyr exists in the --model-path argument when generating the model responses here. This will ensure the correct chat template is loaded.
However, this does not appear to hold for the latest code in fastchat/llm_judge.
The chat template provided in tokenizer_config.json produces a prompt like this:
'<|system|>\n
\n
<|user|>\n
Please provide the content of conditions.....\n
<|assistant|>\n
'
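To make the expected layout easy to compare against, here is a minimal sketch that rebuilds the single-turn prompt quoted above in plain Python. The function name `render_zephyr_prompt` and the `eos` parameter are hypothetical and only for illustration. The real template in tokenizer_config.json may also append an end-of-sequence token after each turn; the quoted string does not show one, so `eos` defaults to empty.

```python
def render_zephyr_prompt(user_message: str, system_message: str = "", eos: str = "") -> str:
    """Render a single-turn prompt in the <|system|>/<|user|>/<|assistant|> layout.

    `eos` is an assumption: pass the tokenizer's EOS string if the real
    template appends it after each turn. It is empty by default so the
    output matches the string quoted in this post.
    """
    return (
        f"<|system|>\n{system_message}{eos}\n"
        f"<|user|>\n{user_message}{eos}\n"
        "<|assistant|>\n"
    )


prompt = render_zephyr_prompt("Please provide the content of conditions.....")
print(repr(prompt))
```

Printing the `repr` makes the newlines visible, which helps when comparing against the prompt that the evaluation script actually sends to the model.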
However, when evaluating Zephyr (with zephyr present in the model path), the chat template actually applied is:
A chat between a curious human and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the human's questions.
### Human: Please provide the content of conditions.....
### Assistant:
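The two layouts are easy to tell apart programmatically. The sketch below rebuilds the Human/Assistant prompt shown above and adds a small helper that guesses which layout a given prompt string uses. Both function names are hypothetical, and the marker strings are taken only from the prompts quoted in this post, not from any library.

```python
SYSTEM = (
    "A chat between a curious human and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the human's questions."
)


def render_human_assistant_prompt(user_message: str, system_message: str = SYSTEM) -> str:
    """Rebuild the '### Human: ... ### Assistant:' prompt observed during evaluation."""
    return f"{system_message}\n### Human: {user_message}\n### Assistant:"


def guess_prompt_format(prompt: str) -> str:
    """Guess the layout from its role markers (a heuristic, not a FastChat API)."""
    if "<|user|>" in prompt and "<|assistant|>" in prompt:
        return "zephyr-style"
    if "### Human:" in prompt and "### Assistant:" in prompt:
        return "human-assistant-style"
    return "unknown"


observed = render_human_assistant_prompt("Please provide the content of conditions.....")
print(guess_prompt_format(observed))
```

Running the helper on a prompt dumped from gen_model_answer.py would show at a glance which template was picked up.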
My question is: which template did the Zephyr technical report use when reporting the 7.34 score on MT-Bench? Should I modify the code in fastchat/llm_judge so that it uses the chat template provided here?
P.S. My command is:
python3 -u gen_model_answer.py --model-path /home/huayu/git/alignment-handbook/data/zephyr-7b-dpo-lora_bz8_8_1_lr_4_e3_logfix/zephyr_checkpoint-969 --model-id zephyr-7b-dpo-lora_bz8_8_1_lr_4_e3_logfix_E1 --num-gpus-total 1
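One way to check which template FastChat selects, without running the full generation, might be to query its template lookup directly. The sketch below assumes `get_conversation_template` from `fastchat.model.model_adapter` is what the evaluation script uses for the lookup, and that the string passed to it is either the model path or the model id (I am not sure which one the script passes, so it may be worth trying both). The name `describe_template` and the short example name are hypothetical.

```python
def describe_template(name: str):
    """Return (template_name, rendered_prompt) for `name`.

    Returns None if FastChat is not installed. `name` should be whatever
    string the evaluation script passes to the lookup (an assumption: this
    may be the --model-path value or the --model-id value).
    """
    try:
        from fastchat.model.model_adapter import get_conversation_template
    except ImportError:
        return None

    conv = get_conversation_template(name)
    # Add one user turn and an empty assistant turn, as for generation.
    conv.append_message(conv.roles[0], "Please provide the content of conditions.....")
    conv.append_message(conv.roles[1], None)
    return conv.name, conv.get_prompt()


result = describe_template("zephyr-7b-dpo-lora")  # hypothetical short name
if result is None:
    print("FastChat is not installed")
else:
    template_name, rendered = result
    print(template_name)
    print(repr(rendered))
```

If the printed template name is not the Zephyr one, that would suggest the lookup string matched a different adapter or fell back to a default, which would be consistent with the prompt shown above.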