mihirp1998 / AlignProp

AlignProp uses direct reward backpropogation for the alignment of large-scale text-to-image diffusion models. Our method is 25x more sample and compute efficient than reinforcement learning methods (PPO) for finetuning Stable Diffusion
https://align-prop.github.io/
MIT License
242 stars 8 forks source link