microsoft / DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
https://www.deepspeed.ai/
Apache License 2.0

[REQUEST] RRHF #3267

Open GanjinZero opened 1 year ago

GanjinZero commented 1 year ago

Is your feature request related to a problem? Please describe. We have posted a paper with code, [RRHF](https://github.com/GanjinZero/RRHF), that can achieve human alignment without RLHF. RRHF needs only 1-2 models, whereas PPO needs 4-5, which significantly reduces GPU memory consumption and makes it easier to scale to larger LLMs. Would it be of interest to add this feature to DeepSpeed or DeepSpeed-Chat?
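To make the request concrete, here is a minimal sketch of the RRHF objective, based on a reading of the paper rather than on the code in the linked repository. It assumes each candidate response comes with per-token log-probabilities from the policy and a scalar reward supplied from outside (for example by a reward model or human ranking). The function names `length_normalized_logprob` and `rrhf_loss` are hypothetical, and plain Python floats stand in for tensors.

```python
def length_normalized_logprob(token_logprobs):
    """Average per-token log-probability of one response under the policy."""
    return sum(token_logprobs) / len(token_logprobs)


def rrhf_loss(responses_token_logprobs, rewards):
    """Ranking loss plus fine-tuning loss over candidate responses to one prompt.

    responses_token_logprobs: list of lists, per-token log-probs for each response.
    rewards: list of floats, one externally supplied score per response.
    """
    scores = [length_normalized_logprob(lp) for lp in responses_token_logprobs]

    # Ranking term: penalize any lower-reward response whose
    # length-normalized score exceeds that of a higher-reward response.
    rank_loss = 0.0
    for i in range(len(scores)):
        for j in range(len(scores)):
            if rewards[i] < rewards[j]:
                rank_loss += max(0.0, scores[i] - scores[j])

    # Fine-tuning term: negative log-likelihood of the best-rewarded response.
    best = max(range(len(rewards)), key=lambda k: rewards[k])
    sft_loss = -sum(responses_token_logprobs[best])

    return rank_loss + sft_loss
```

In this sketch only the policy is evaluated during the update, with an optional second model providing `rewards`, which would be consistent with the 1-2 model count mentioned above.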

yaozhewei commented 1 year ago

Hi @GanjinZero, we are definitely interested, and you are very welcome to contribute! Do you have a plan or any details to share, or shall we figure it out together?