OpenLLMAI / OpenRLHF

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)
https://openrlhf.readthedocs.io/
Apache License 2.0
1.71k stars, 160 forks

Training DPO with Deepseek-lite fails with "expected mat1 and mat2 to have the same type, but got: float != c10::BFloat16" #306

Open victorShawFan opened 1 month ago

victorShawFan commented 1 month ago
[Attached: three WeCom screenshots]
hijkzzz commented 1 month ago

In cases like this, you can usually try upgrading deepspeed to the latest version (though there is no guarantee it will work).
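
To confirm which deepspeed version is actually in use before and after upgrading, something like the following may help. This is only a minimal sketch: the helper name `installed_version` and the list of packages to check are assumptions for illustration, not something taken from this thread or from OpenRLHF.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Hypothetical list of packages that may be relevant to a dtype mismatch;
# adjust it to match your own environment.
for name in ("deepspeed", "torch", "transformers"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

Running it once before and once after `pip install -U deepspeed` shows whether the upgrade took effect in the environment that launches training, and the printed versions are useful to include when reporting that the error persists.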

victorShawFan commented 1 month ago

It indeed did not work.

lixsh6 commented 4 weeks ago

It indeed did not work.

Is there a way to fix this? I am running into a similar problem.