zwhe99 closed this 10 months ago
Thank you for your support! Reward model training involves several additional methods, so it will be explained in the second part of the technical report. Thanks again for your support and recognition!
I see. Thanks for your response!
Will you make the training script of the reward model public?