shibing624 / MedicalGPT

MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline. Trains medical large language models, implementing incremental pre-training (PT), supervised fine-tuning (SFT), RLHF, DPO, and ORPO.
Apache License 2.0
Question about reward_modeling #361

Open tuqingwen opened 2 months ago

tuqingwen commented 2 months ago

Hi, can the newly added reward_modeling.py script also be used to train a scoring model? And should the dataset follow the same format as the files in data/reward?

shibing624 commented 2 months ago

Yes, it can.
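
For readers following along, here is a minimal sketch of what a pairwise preference record and the usual pairwise ranking loss for a scoring (reward) model can look like. It is not taken from reward_modeling.py: the field names `question`, `response_chosen`, and `response_rejected`, and the helpers `validate_record` and `pairwise_loss`, are assumptions for illustration only, so check the files in data/reward for the actual layout.

```python
import json
import math

# Hypothetical field names; verify against the real files in data/reward.
REQUIRED_FIELDS = {"question", "response_chosen", "response_rejected"}

# One illustrative JSON line with placeholder text (not real training data).
SAMPLE_LINE = json.dumps({
    "question": "example question",
    "response_chosen": "the preferred answer",
    "response_rejected": "the less preferred answer",
})


def validate_record(line: str) -> dict:
    """Parse one JSON line and check that the assumed fields are present."""
    record = json.loads(line)
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return record


def pairwise_loss(score_chosen: float, score_rejected: float) -> float:
    """Pairwise ranking loss: -log(sigmoid(score_chosen - score_rejected)).

    Computed as softplus(-margin) in a numerically stable form.
    """
    margin = score_chosen - score_rejected
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


record = validate_record(SAMPLE_LINE)
# The loss is small when the chosen answer scores higher, large otherwise.
loss_good = pairwise_loss(2.0, 0.0)
loss_bad = pairwise_loss(0.0, 2.0)
```

The loss only depends on the score difference, so a scorer trained this way learns a relative ranking between the two responses rather than an absolute scale.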