shibing624 / MedicalGPT

MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline. Trains medical large language models, implementing incremental pretraining (PT), supervised fine-tuning (SFT), RLHF, DPO, and ORPO.
Apache License 2.0
2.94k stars 451 forks

Can DPO be changed to take inputIds and attention_mask as input? #383

Open Faded1022 opened 1 week ago

Faded1022 commented 1 week ago

Describe the Question

Please provide a clear and concise description of what the question is.

shibing624 commented 1 week ago

Try it yourself; you will need to modify the trl source code.
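
For anyone attempting this, here is a rough sketch of what a pre-tokenized DPO example could look like before any trl changes are made. It is only an illustration in plain Python: the helper names (`build_sequence`, `build_dpo_example`) and the `chosen_*` / `rejected_*` field names are assumptions made for this example, not trl's confirmed API, and the keys a given trl version actually expects should be checked in its source.

```python
from typing import Dict, List

# Label value that PyTorch's cross-entropy loss ignores by default.
IGNORE_INDEX = -100


def build_sequence(prompt_ids: List[int], response_ids: List[int],
                   pad_id: int, max_len: int) -> Dict[str, List[int]]:
    """Concatenate prompt and response, then pad to max_len (hypothetical helper)."""
    input_ids = (prompt_ids + response_ids)[:max_len]
    # Prompt tokens are masked out so only response tokens contribute to the loss.
    labels = ([IGNORE_INDEX] * len(prompt_ids) + response_ids)[:max_len]
    pad = max_len - len(input_ids)
    return {
        "input_ids": input_ids + [pad_id] * pad,
        "attention_mask": [1] * len(input_ids) + [0] * pad,
        "labels": labels + [IGNORE_INDEX] * pad,
    }


def build_dpo_example(prompt_ids: List[int], chosen_ids: List[int],
                      rejected_ids: List[int], pad_id: int = 0,
                      max_len: int = 8) -> Dict[str, List[int]]:
    """Build one preference pair with assumed chosen_*/rejected_* field names."""
    example: Dict[str, List[int]] = {}
    for name, response in (("chosen", chosen_ids), ("rejected", rejected_ids)):
        seq = build_sequence(prompt_ids, response, pad_id, max_len)
        for key, value in seq.items():
            example[f"{name}_{key}"] = value
    return example


example = build_dpo_example(prompt_ids=[5, 6], chosen_ids=[7, 8, 9], rejected_ids=[10])
print(example["chosen_input_ids"])
print(example["chosen_attention_mask"])
```

Feeding such tensors to the trainer would still require bypassing or rewriting its own tokenization step, which is the part that needs the trl source changes mentioned above.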