CarperAI / DRLX

Diffusion Reinforcement Learning Library
MIT License

Add Direct Preference Optimization support #3

Open tmabraham opened 1 year ago

tmabraham commented 1 year ago

There should be a way to do Direct Preference Optimization with diffusion models. Ryan Murdock already has it working: https://twitter.com/advadnoun/status/1677479082752364546

Requires further investigation.
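
For reference, here is a minimal sketch of the standard DPO loss for a single preference pair. It is written in plain Python and is not DRLX code or the implementation from the linked tweet. The function name `dpo_loss`, its argument names, and the default `beta=0.1` are hypothetical. The sketch assumes scalar log-likelihoods are already available for the preferred and rejected samples under both the trained model and a frozen reference model. How to obtain or approximate those log-likelihoods for a diffusion model is exactly the part that still needs investigation.

```python
import math


def dpo_loss(policy_logp_w, policy_logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """DPO loss for one (preferred, rejected) pair.

    All four inputs are assumed to be scalar log-likelihoods:
    *_w for the preferred sample, *_l for the rejected one.
    """
    # Implicit reward margin: how much more the policy favors the preferred
    # sample over the rejected one, relative to the reference model.
    margin = beta * ((policy_logp_w - ref_logp_w) - (policy_logp_l - ref_logp_l))

    # -log(sigmoid(margin)) == log(1 + exp(-margin)), computed stably
    # so that exp() is never called on a large positive number.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

When the policy and the reference agree, the margin is zero and the loss equals `log(2)`. The loss falls below that value as the policy shifts probability toward the preferred sample relative to the reference.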