Add Direct Preference Optimization support

CarperAI / DRLX

Diffusion Reinforcement Learning Library

MIT License


Open tmabraham opened 1 year ago

tmabraham commented 1 year ago

There should be a way to do Direct Preference Optimization (DPO) with diffusion models. Ryan Murdock reports that he already has it working: https://twitter.com/advadnoun/status/1677479082752364546

Requires further investigation.
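
For reference while investigating, here is a minimal sketch of the standard Direct Preference Optimization loss for a single preference pair, in plain Python. It is not taken from DRLX or from the linked tweet. The function name `dpo_loss`, its parameters, and the default `beta` are hypothetical, and how to obtain the log-probabilities from a diffusion model (for example, by summing per-step denoising log-probabilities) is an assumption this sketch leaves open.

```python
import math


def dpo_loss(policy_logp_w, policy_logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """DPO loss for one (preferred, rejected) pair.

    policy_logp_* : log-probability of each sample under the model being trained
    ref_logp_*    : log-probability of each sample under the frozen reference model
    beta          : strength of the implicit KL penalty (hypothetical default)

    All names are illustrative; this is not an existing DRLX API.
    """
    # Implicit reward margin: how much more the policy favors the preferred
    # sample over the rejected one, relative to the reference model.
    margin = beta * ((policy_logp_w - ref_logp_w) - (policy_logp_l - ref_logp_l))
    # -log(sigmoid(margin)) equals softplus(-margin); computed in a stable form.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
```

When the policy and the reference agree on both samples, the margin is zero and the loss is log 2. The loss falls as the policy raises the relative likelihood of the preferred sample. A real integration would presumably compute this over batches of tensors rather than scalars.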