MC-E / DragonDiffusion

ICLR 2024 (Spotlight)
Apache License 2.0

Question about paper (Unusual loss function design) #5

Open YunhoKim21 opened 1 year ago

YunhoKim21 commented 1 year ago

Hi, I really appreciate this work, which enables drag-based image editing in diffusion models. The results look good.

One thing I am curious about is the loss function design. In Equation 5 of the paper, the total loss function incorporates two cosine similarities. The conventional approach would be a weighted sum of the two cosine similarities. In the paper, however, the authors instead add the inverses of the two cosine similarities, with a constant (alpha) added. I am curious where this design came from and whether the choice is empirical or theoretical.

Thank you!

MC-E commented 1 year ago

Hi, our loss function aims to maximize the similarity, so we take the inverse of the similarity. To prevent the denominator from being zero, we add a coefficient.
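To make the comparison in this thread concrete, here is a minimal sketch contrasting an inverse-similarity energy with a plain linear one. It is not taken from the DragonDiffusion code. The function names, the rescaling of the cosine to [0, 1], and the default `alpha=1.0` are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity(u, v):
    # Assumption: rescale the cosine from [-1, 1] to [0, 1] so that
    # the denominator below stays positive for any alpha > 0.
    return 0.5 * cosine(u, v) + 0.5


def inverse_energy(s, alpha=1.0):
    # Hypothetical inverse form: minimizing 1 / (alpha + s) maximizes s.
    return 1.0 / (alpha + s)


def inverse_energy_grad(s, alpha=1.0):
    # Derivative of 1 / (alpha + s) with respect to s.
    return -1.0 / (alpha + s) ** 2


def linear_energy(s):
    # The "conventional" alternative: its derivative w.r.t. s is always -1.
    return 1.0 - s


if __name__ == "__main__":
    for s in (0.0, 0.5, 1.0):
        print(s, inverse_energy(s), inverse_energy_grad(s), linear_energy(s))
```

In this sketch both energies decrease as the similarity increases, so minimizing either one pushes the similarity up. The difference is that the gradient of the inverse form is larger in magnitude when the similarity is low and shrinks as the similarity approaches 1, whereas the linear form has a constant gradient. Whether this is the motivation behind the paper's design is not stated in the thread.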

david20571015 commented 5 months ago

Hi, @MC-E. In your paper, the denoising process increases $\log q(y|z_t)$, which is the energy function $E$. But increasing $E$ would decrease the cosine similarities. Is that right, or am I misunderstanding something?