Is there a comparison of results with and without RLHF? - Githubissues

sunzeyeah / RLHF

Implementation of Chinese ChatGPT

282 stars 36 forks

Is there a comparison of results with and without RLHF? #4

Closed macheng6 closed 1 year ago

macheng6 commented 1 year ago

As the title says.

sunzeyeah commented 1 year ago

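Hello. The RLHF part is still being debugged and optimized. It needs the SFT model and the reward model loaded at the same time, so it consumes a lot of compute, and stable convergence of the RL training is hard to guarantee.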