MC-E / DragonDiffusion

ICLR 2024 (Spotlight)
Apache License 2.0

No fine-tuning or extra module needed, why losses and learning rate? #1

Open DarrenIm opened 1 year ago

DarrenIm commented 1 year ago

Hi, interesting work. As mentioned in the paper:

All content editing and preservation signals in our proposed method come from the image itself. It allows for a direct translation of T2I generation ability in diffusion models to image editing tasks without the need for any model fine-tuning or training.

I'm confused by the losses and even a learning rate defined in Eq. 5 and Eq. 6 — what are they used for, if there is no training? Thanks in advance.

MC-E commented 1 year ago

Our method is designed based on the score function (https://arxiv.org/abs/1907.05600). This approach treats the diffusion process from a continuous perspective, in which a gradient is computed at each diffusion step. The losses in Eq. 5 and Eq. 6 serve as additional guidance for that gradient — they steer the sampling trajectory at inference time, so no model parameters are trained or fine-tuned.
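To make the idea concrete, here is a toy NumPy sketch of gradient guidance during sampling. It is not the paper's implementation: the quadratic editing loss, the stubbed noise prediction, and the names `guided_step`, `target`, and `lr` are all illustrative assumptions. It only shows the mechanism the reply describes — at each step, the gradient of a loss is added to the denoising direction, scaled by a "learning rate", without updating any model weights.

```python
import numpy as np

def guided_step(z, eps_pred, target, sigma, lr=0.25):
    """One toy denoising step with loss-gradient guidance.

    z        : current noisy latent (NumPy array)
    eps_pred : the diffusion model's noise prediction at this step
               (stubbed here; in practice it comes from the UNet)
    target   : editing target for the toy loss L(z) = ||z - target||^2
    sigma    : noise scale of this step
    lr       : guidance strength (the "learning rate" in Eq. 5 / Eq. 6)
    """
    # Gradient of the toy editing loss, computed analytically:
    # d/dz ||z - target||^2 = 2 (z - target).
    grad = 2.0 * (z - target)
    # The update combines the usual denoising direction with the
    # loss gradient; only the latent z is modified, never the model.
    return z - sigma * eps_pred - lr * grad
```

In DragonDiffusion the loss is built from intermediate diffusion features of the image itself rather than a pixel-space target, but the role of the gradient and the learning rate is the same as in this sketch: they guide each sampling step, which is why no fine-tuning or extra module is needed.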