stanfordnlp / pyreft

ReFT: Representation Finetuning for Language Models
https://arxiv.org/abs/2404.03592
Apache License 2.0

[P0] Adding DPO Support #66

Closed jinzhuoran closed 2 months ago

jinzhuoran commented 2 months ago

Hi @frankaging, thanks for open-sourcing such a useful toolkit. I'm quite curious about how DPO could potentially integrate with ReFT within your project. Could you share whether there are any plans to incorporate DPO?

frankaging commented 2 months ago

Triaged as P1 since it's a nice-to-have.

frankaging commented 2 months ago

@jinzhuoran yes! Thanks for your interest! Integrating with DPO would definitely be cool -- ReFT allows quick iterative adaptation. Additionally, a reward model trained with ReFT is essentially the base LM plus a set of very small interventions. The same base model can be trained with another set of interventions for language completion in parallel, so you don't need to load two copies of the model into memory either.
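For concreteness, here is a minimal sketch of what such a ReFT reward model looks like, following the usual pyreft recipe (the model name, layer, and rank below are just placeholders):

```python
import torch, transformers, pyreft

# load the (frozen) base LM once; it can be shared across intervention sets
model = transformers.AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf", torch_dtype=torch.bfloat16, device_map="cuda")

# a ReFT "reward model" is just this base LM plus a tiny low-rank intervention
reward_reft_config = pyreft.ReftConfig(representations={
    "layer": 15,
    "component": "block_output",
    "low_rank_dimension": 4,
    "intervention": pyreft.LoreftIntervention(
        embed_dim=model.config.hidden_size, low_rank_dimension=4),
})
reward_model = pyreft.get_reft_model(model, reward_reft_config)
reward_model.print_trainable_parameters()  # only the intervention params train
```

In principle, a second set of interventions over the same `model` object covers the completion policy, so the heavy base weights only have to be loaded once.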

In short, there is a lot to explore with DPO + ReFT, but we are currently looking for help! If you want to take it on, let us know! We could help on the side.

aryamanarora commented 2 months ago

btw it's super easy to implement this already with the existing ReftRewardTrainer and a slight modification of the loss computation in DPOTrainer from the trl library, will add it to the library soon!
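the core change is just the pairwise loss; roughly something like this untested sketch (the function name, signature, and beta default are illustrative, not the final trainer API):

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             reference_chosen_logps, reference_rejected_logps, beta=0.1):
    """Pairwise DPO loss over summed per-sequence log-probs.

    With ReFT, the policy log-probs come from the intervened model and the
    reference log-probs from the same base model without interventions, so
    there is no need to keep a second copy of the weights in memory.
    """
    chosen_logratios = policy_chosen_logps - reference_chosen_logps
    rejected_logratios = policy_rejected_logps - reference_rejected_logps
    # logsigmoid stays numerically stable even for large margins
    losses = -F.logsigmoid(beta * (chosen_logratios - rejected_logratios))
    return losses.mean()
```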

frankaging commented 2 months ago

@AmirZur will work on the DPO trainer and make a PR soon! Local tests with TruthfulQA seem promising.

jinzhuoran commented 2 months ago

Thank you for your help! I can't wait to try out this new feature!

jinzhuoran commented 2 months ago

Hi @AmirZur @frankaging, I'm trying to use ReFT with DPO, but I often encounter loss=nan. Have you ever experienced this situation?

AmirZur commented 2 months ago

Hi @jinzhuoran! I haven't run into loss=nan issues yet. You can find my implementation and a small walkthrough notebook in the amir/dpo branch -- feel free to compare it against yours.

frankaging commented 2 months ago

marking this ticket as closed! feel free to open new ones for other questions!

The DPO folder is here. Thanks @AmirZur!!