微调后的模型回答带有很多重复[Bug]

InternLM / lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

https://lmdeploy.readthedocs.io/en/latest/

Apache License 2.0

4.16k stars 376 forks source link

微调后的模型回答带有很多重复[Bug] #1190

Open xxg98 opened 6 months ago

xxg98 commented 6 months ago

Checklist

[ ] 1. I have searched related issues but cannot get the expected help.
[ ] 2. The bug has not been fixed in the latest version.

Describe the bug

启动微调后的模型，回答带有很多重复（微调的internlm2-chat-7b），对模型提的问题是：《快乐星球》的主演有哪些？前面正常，后面就开始重复（经多次提问，出现这种重复的概率不大）：

Reproduction

lmdeploy serve api_server /root/autodl-tmp/projects/LLM/fine_tuning/7b/internlm2-chat-7b-merge --server-name 0.0.0.0 --server-port 6006 --tp 1 --session-len 128000 --rope-scaling-factor 2 --cache-max-entry-count 0.25

Environment

python：3.10
显卡：4090
cuda：12.2
lmdeploy：0.2.3
xtuner：0.1.14.dev0

Error traceback

No response

xxg98 commented 6 months ago

又出现了一次

lvhan028 commented 6 months ago

推理的时候，试试设置 repetition_penalty 为1.02

xxg98 commented 6 months ago

1.02

增加该参数之后语无伦次了，hh

lvhan028 commented 6 months ago

用transformers推理的结果是怎样的呢？

xxg98 commented 6 months ago

用transformers推理的结果是怎样的呢？

具体是什么代码呢，形如下面这样的吗？

model = AutoModelForCausalLM.from_pretrained(model_path, torch_dype=torch.float16, trust_remote_code=True).cuda()
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

lvhan028 commented 6 months ago

对。只是prompt送给transformers的model之前，要先根据对话模板拼接一下。是用xtuner微调的么？如果是的话，印象中xtuner有chat的功能，可以试试

xxg98 commented 6 months ago

是用xtuner微调的

是用xtuner微调的，您的意思是说用xtuner的chat试试，会不会发生这种重复的情况吗？

lvhan028 commented 6 months ago

恩，是这个意思

lvhan028 commented 6 months ago

@pppppM 还请帮忙提供下xtuner chat的方式

pppppM commented 6 months ago

@xxg98 可以先用 xtuner chat 验证一下训练后的模型是否正常对话 xtuner chat /root/autodl-tmp/projects/LLM/fine_tuning/7b/internlm2-chat-7b-merge --prompt-template internlm2_chat

同时，可以看下训练日志中，EvalHook 的输出是否正常

xxg98 commented 6 months ago

恩，是这个意思

我会去去试试的

xxg98 commented 6 months ago

valHook

损失下降挺正常的，针对配置中evaluation_inputs的回答也是挺正常的(和训练的数据集对的上)