UnicomAI / Unichat-llama3-Chinese

Apache License 2.0

Example inference differs from llama_factory's web inference #7

Open Micla-SHL opened 4 months ago

Micla-SHL commented 4 months ago

Hi, this is the llama3 template from LLaMA-Factory's template.py:

_register_template(
    name="llama3",
    format_user=StringFormatter(
        slots=[
            "<|start_header_id|>user<|end_header_id|>\n\n{{content}}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
        ]
    ),
    format_system=StringFormatter(slots=[{"bos_token"}, "{{content}}"]),
    stop_words=["<|eot_id|>"],
    replace_eos=True,
    force_system=True,
)

It does not actually end the conversation during inference:

User: 你怎么了
Unichat-llama3-Chinese: 你好!我是一个人工智能助手,很高兴见到你。有什么我可以帮助你的?.epend2.epend2.epend2.epend2.epend2.epend2.epend2.epend2.epend2.epend2.epend2.epend2.epend2.epend2.epend2.epend2.epend2.epend2......... (the ellipsis means ".epend2" repeats many more times; I have only excerpted part of the output)

I suspect the slot slots=["<|start_header_id|>user<|end_header_id|>\n\n{{content}}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"] is written incorrectly. How should I fix it?

I also noticed that the example inference code simply does this for convenience:

terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<|eot_id|>")
]
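
For reference, a minimal sketch of how such terminators can be passed through to generation with the transformers text-generation pipeline; the model id and prompt below are placeholders, not taken from this thread:

from transformers import pipeline

# Placeholder model id -- substitute the actual Unichat-llama3-Chinese checkpoint.
pipe = pipeline("text-generation", model="path/or/hub-id-of-Unichat-llama3-Chinese")

# Stop on either the tokenizer's own EOS token or the Llama-3 turn terminator <|eot_id|>.
terminators = [
    pipe.tokenizer.eos_token_id,
    pipe.tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]

# eos_token_id accepts a list in recent transformers versions, so generation
# stops as soon as any of the listed ids is produced.
outputs = pipe("你怎么了", max_new_tokens=256, eos_token_id=terminators)
print(outputs[0]["generated_text"])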

Micla-SHL commented 4 months ago

I heard that your chat template differs from the official one. What exactly does it look like? Could you give a few more hints? I'd like to try adjusting it myself.

hiyouga commented 4 months ago
_register_template(
    name="unichat",
    format_user=StringFormatter(slots=["Human:{{content}}\nAssistant:"]),
    format_system=StringFormatter(slots=[{"bos_token"}, "{{content}}\n"]),
    format_separator=EmptyFormatter(slots=["\n", {"bos_token"}]),
    default_system="A chat between a curious user and an artificial intelligence assistant.The assistant gives helpful, detailed, and polite answers to the user's questions.",
    stop_words=["<|eot_id|>"],
)
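
One way to sanity-check whether a stop word such as <|eot_id|> can actually fire for this model is to inspect the tokenizer's special tokens directly; a minimal sketch, where the model path is a placeholder:

from transformers import AutoTokenizer

# Placeholder path -- point this at the Unichat-llama3-Chinese tokenizer in use.
tok = AutoTokenizer.from_pretrained("path/or/hub-id-of-Unichat-llama3-Chinese")

# If <|eot_id|> is not a real token in this vocabulary, stopping on it cannot work,
# and generation only ends at the tokenizer's actual EOS (e.g. <|end_of_text|>).
print("eos_token:", tok.eos_token, tok.eos_token_id)
print("candidate stop ids:", tok.convert_tokens_to_ids(["<|eot_id|>", "<|end_of_text|>"]))
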
Micla-SHL commented 4 months ago

Thanks, I roughly understand now. I'll debug it myself.

UnicomAI commented 4 months ago

_register_template(
    name="llama3-unichat",
    format_user=StringFormatter(slots=["Human:{{content}}\nAssistant:"]),
    format_assistant=StringFormatter(slots=["{{content}}<|end_of_text|>"]),
    format_system=StringFormatter(slots=["<|begin_of_text|>{{content}}"]),
    default_system="A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.\n"
)
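
For comparison, a minimal sketch (plain string assembly, not LLaMA-Factory internals) of the single-turn prompt the llama3-unichat template above should render; the user message is just an example:

# Assemble one turn by hand from the slots above, purely to visualise the format.
system = ("A chat between a curious user and an artificial intelligence assistant. "
          "The assistant gives helpful, detailed, and polite answers to the user's questions.\n")
user_message = "你怎么了"

# format_system -> "<|begin_of_text|>{{content}}", format_user -> "Human:{{content}}\nAssistant:"
prompt = "<|begin_of_text|>" + system + "Human:" + user_message + "\nAssistant:"
print(prompt)

# A completed assistant turn would then be closed with format_assistant:
# "{{content}}<|end_of_text|>"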

hiyouga commented 4 months ago

Hello, thanks for your response. According to your tokenizer's chat template on Hugging Face, we think the template we provided should produce a more accurate prompt:

<|begin_of_text|>A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
Human:prompt
Assistant:answer<|end_of_text|>
<|begin_of_text|>Human:prompt
Assistant:
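
A quick way to compare either registered template against the tokenizer's chat template mentioned above is apply_chat_template; a rough sketch, with the model id again a placeholder:

from transformers import AutoTokenizer

# Placeholder id -- use the Unichat-llama3-Chinese checkpoint under discussion.
tok = AutoTokenizer.from_pretrained("path/or/hub-id-of-Unichat-llama3-Chinese")

messages = [{"role": "user", "content": "prompt"}]

# Render the prompt string the tokenizer's built-in chat template produces,
# with the generation prefix appended, and compare it with the block above.
rendered = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(rendered)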