yk7333 / d3po

[CVPR 2024] Code for the paper "Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model"
https://arxiv.org/abs/2311.13231
MIT License
168 stars 14 forks source link