zhaoyl18 / SEIKO

SEIKO is a novel reinforcement learning method for efficiently fine-tuning diffusion models in an online setting. Our method outperforms all baselines (PPO, classifier-based guidance, direct reward backpropagation) for fine-tuning Stable Diffusion.
https://arxiv.org/abs/2402.16359
MIT License
14 stars, 0 forks