OpenLMLab / MOSS-RLHF

Apache License 2.0

Training script of reward model #14

Closed by zwhe99 10 months ago

zwhe99 commented 10 months ago

Will you make the training script of the reward model public?

Ablustrund commented 10 months ago

Thank you for your support! Because reward model training involves several additional methods, it will be explained in the second part of the technical report. Thank you again for your support and recognition!

zwhe99 commented 10 months ago

I see. Thanks for your response!