shibing624 / MedicalGPT

MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline. Trains medical large language models, implementing incremental pre-training (PT), supervised fine-tuning (SFT), RLHF, DPO, and ORPO.
Apache License 2.0

Error during the DPO stage #334

Closed small-white-zs closed 4 months ago

small-white-zs commented 4 months ago

ValueError: You passed both a ref_model and a peft_config. For training PEFT adapters with DPO there is no need to pass a reference model. Please pass ref_model=None in case you want to train PEFT adapters.
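The error message itself states the fix: when a `peft_config` is supplied, `DPOTrainer` builds the reference model implicitly (the base model with adapters disabled), so an explicit `ref_model` must not be passed. A minimal configuration sketch of that setup is below; the model name, LoRA hyperparameters, and `train_dataset` are illustrative assumptions, not values from MedicalGPT's script, and the argument layout follows trl versions around the time of this issue.

```python
# Hedged sketch: DPO training with PEFT adapters, assuming a trl version
# contemporary with this issue. Model id, LoRA settings, and train_dataset
# are placeholders, not taken from MedicalGPT's actual run_dpo script.
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

model = AutoModelForCausalLM.from_pretrained("your-base-model")  # placeholder
tokenizer = AutoTokenizer.from_pretrained("your-base-model")     # placeholder

peft_config = LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05, task_type="CAUSAL_LM"
)

trainer = DPOTrainer(
    model=model,
    ref_model=None,  # must be None with peft_config: the trainer uses the
                     # base model (adapters disabled) as the reference model
    args=TrainingArguments(output_dir="outputs-dpo"),
    beta=0.1,
    train_dataset=train_dataset,  # placeholder: prompt/chosen/rejected columns
    tokenizer=tokenizer,
    peft_config=peft_config,
)
```

Passing both `ref_model` and `peft_config` is what triggers the `ValueError` above.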

[Screenshot: 2024-02-29 104954]

shibing624 commented 4 months ago

Which base model are you using? Please also share your transformers and peft versions, and describe the exact steps that trigger the error so I can reproduce it.

small-white-zs commented 4 months ago

I was just running the Colab copy linked from your GitHub homepage.

shibing624 commented 4 months ago

I see, this may be caused by a recent peft version update. I'll rerun it and check.

small-white-zs commented 4 months ago

Thanks for looking into it.

shibing624 commented 4 months ago

Fixed.