CarperAI / trlx

A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)
MIT License

Minimum Risk Training support #354

Open alexandremuzio opened 1 year ago

alexandremuzio commented 1 year ago

🚀 The feature, motivation, and pitch

I've been working on RLHF for a while and have been exploring the use of Minimum Risk Training (see the original paper, along with further investigations in follow-up work) for improving encoder-decoder translation models via RL fine-tuning. It's an interesting procedure that is a lot simpler than PPO but seems to be much more stable for the translation setup I've been working with.
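To give a rough idea of what the objective looks like, here is a minimal sketch of an MRT loss: sample a set of candidate translations per source sentence, score each with a task risk (1 - sentence BLEU is a common choice), and minimize the expected risk under the renormalized candidate distribution. The function name, tensor shapes, and `alpha` value below are illustrative, not a proposal for the actual trlx API.

```python
import torch

def mrt_loss(seq_log_probs: torch.Tensor, risks: torch.Tensor, alpha: float = 5e-3) -> torch.Tensor:
    """Minimum Risk Training loss over a set of sampled candidates for one source.

    seq_log_probs: (num_candidates,) log p(y|x) for each sampled translation
    risks:         (num_candidates,) task risk per candidate, e.g. 1 - sentence BLEU
    alpha:         sharpness of the renormalized candidate distribution
    """
    # Renormalize the scaled candidate probabilities over the sampled set:
    # q(y|x) is proportional to p(y|x) ** alpha.
    q = torch.softmax(alpha * seq_log_probs, dim=-1)
    # Expected risk under the renormalized distribution.
    return (q * risks).sum()
```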

I'm wondering if integrating this training procedure would be of interest to anyone, and if so, I could work on adding it.

For my experiments, I've been using MarianMT Hugging Face encoder-decoder models (which could also be integrated), but the procedure should also work with the currently supported T5 models and possibly with decoder-only LMs.
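For context, candidates for the loss above can be drawn with the standard transformers generation API; the checkpoint name below is just an example and isn't tied to my setup. T5 or other seq2seq models should plug in the same way.

```python
from transformers import MarianMTModel, MarianTokenizer

# Example checkpoint only; any Marian (or T5) seq2seq model works the same way.
name = "Helsinki-NLP/opus-mt-en-de"
tokenizer = MarianTokenizer.from_pretrained(name)
model = MarianMTModel.from_pretrained(name)

src = ["RL fine-tuning can improve translation models."]
inputs = tokenizer(src, return_tensors="pt", padding=True)

# Draw k candidate translations per source sentence.
k = 8
gen = model.generate(
    **inputs,
    do_sample=True,
    num_return_sequences=k,
    max_new_tokens=64,
    output_scores=True,
    return_dict_in_generate=True,
)

# Per-token log-probs of the sampled tokens, summed into sequence log-probs
# (masking padding after EOS); these feed the MRT weighting sketched above.
scores = model.compute_transition_scores(gen.sequences, gen.scores, normalize_logits=True)
mask = gen.sequences[:, 1:] != tokenizer.pad_token_id  # drop the decoder-start token
seq_log_probs = scores.masked_fill(~mask, 0.0).sum(dim=-1)
```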

Alternatives

No response

Additional context

No response

LouisCastricato commented 1 year ago

Hey, sure, this sounds interesting. If you get this working well, I think I would be open to collaborating on a joint blog post. Including a new training paradigm in trlX is a substantial effort on our part, but if there is enough demand for it (which I think could be rallied by a blog post), we would certainly be open to considering it :)

Can you send me an email? louis@stability.ai

alexandremuzio commented 1 year ago

That's great! I've sent you an email, and I'll update this thread with next steps.