vLLM output does not stop after merging a LoRA fine-tuned model

We-IOT / chatglm3_6b_finetune

LoRA fine-tuning based on the chatglm3-6b model

GNU General Public License v3.0


Open bchengwang opened 7 months ago

bchengwang commented 7 months ago

After LoRA fine-tuning and merging, the output under vLLM never stops. @We-IOT, have you run into this?

We-IOT commented 7 months ago

Could you describe it in more detail?

bchengwang commented 7 months ago

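"Could you describe it in more detail?" @We-IOT, here is what I did. I first fine-tuned chatglm3-6b with the LoRA method. After fine-tuning, I merged the model using the LoRA merge code you published. I then deployed the merged model with vLLM. When I call the API, the model keeps generating further questions and replies, and the output contains the official special tokens <|user|> and <|assistant|>.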
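One client-side stopgap for the symptom described above is to cut the returned text at the first role marker. The sketch below is a minimal illustration, not a fix for the underlying stop condition. The helper name `truncate_at_role_marker` and the marker list are assumptions made for this example; the list uses only the two tokens mentioned in this comment.

```python
# Role markers reported in the output above; extend as needed (assumption).
ROLE_MARKERS = ("<|user|>", "<|assistant|>")


def truncate_at_role_marker(text, markers=ROLE_MARKERS):
    """Return text up to the first role marker, with trailing whitespace removed.

    Hypothetical helper: it only cleans the returned string and does not
    stop the server from generating the extra tokens.
    """
    # Positions of every marker that actually occurs in the text.
    positions = [text.find(m) for m in markers if m in text]
    if not positions:
        return text.rstrip()
    return text[: min(positions)].rstrip()


if __name__ == "__main__":
    raw = "Hello, how can I help?<|user|> next question<|assistant|> next answer"
    print(truncate_at_role_marker(raw))
```

The server still spends time on the discarded tokens, so a server-side stop condition is preferable where one is available.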

We-IOT commented 7 months ago

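I haven't deployed with vLLM. Does the same thing happen when you run inference with transformers? It looks like generation gets stuck in a loop and cannot exit the multi-turn conversation. In theory, <|assistant|> should be part of the model's output and should not be fed back into the input.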
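The loop described above can be illustrated with a toy decoding loop. This is a sketch under stated assumptions, not vLLM or transformers code: the token ids, the `fake_stream` sequence, and the `generate` helper are all made up for illustration. It shows that generation ends at a role token only if that token's id is in the stop set; otherwise it continues until EOS or the length limit.

```python
# Hypothetical token ids, chosen only for this illustration.
EOS_ID, USER_ID, ASSISTANT_ID = 2, 100, 101


def generate(stream, stop_ids, max_tokens=16):
    """Collect tokens from stream until a stop id or max_tokens is reached."""
    out = []
    for tok in stream:
        if tok in stop_ids or len(out) >= max_tokens:
            break
        out.append(tok)
    return out


# A made-up model output: an answer (10, 11), then a new user turn that the
# model invents itself, then another answer, and only then EOS.
fake_stream = [10, 11, USER_ID, 20, 21, ASSISTANT_ID, 30, 31, EOS_ID]

# Stopping on EOS alone lets the invented turns through.
only_eos = generate(fake_stream, stop_ids={EOS_ID})
# Adding the user-role id to the stop set ends generation after the answer.
with_user = generate(fake_stream, stop_ids={EOS_ID, USER_ID})

if __name__ == "__main__":
    print(only_eos)
    print(with_user)
```

In vLLM, the `stop_token_ids` field of `SamplingParams` plays the role of `stop_ids` here. Which ids to pass depends on the tokenizer of the merged model, so they should be looked up from that tokenizer rather than hard-coded.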