Open YunhoKim21 opened 1 year ago
Hi, our goal is to maximize the similarity, so the loss we minimize is the inverse of the similarity. To prevent the denominator from being zero, we add a constant to it.
Hi, @MC-E. In your paper, the denoising process increases $\log q(y|z_t)$, which is the energy function $E$. But increasing $E$ would decrease the cosine similarities. Is that right, or am I misunderstanding something?
Hi, I really appreciate this work, which enables drag-based image editing in diffusion models. The results look good.
One thing I am curious about is the design of the loss function. In equation 5 of the paper, the total loss incorporates two cosine similarities. The conventional approach would be a weighted sum of the two cosine similarities. In the paper, however, the authors instead add the inverses of the two cosine similarities, with a constant (alpha) added. Where did the idea for this design come from, and is the choice empirical or theoretical?
Thank you!
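
To make the comparison in this thread concrete, here is a minimal sketch of the two loss shapes being discussed. It is not the paper's implementation: the exact form of equation 5 is not reproduced here, and the function names, the weights `w1`/`w2`, the default `alpha`, and the rescaling of the cosine similarity to [0, 1] are all assumptions made for illustration.

```python
import math


def cosine_similarity(u, v):
    """Plain cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def to_unit_interval(cos_sim):
    """Assumption: rescale cosine similarity from [-1, 1] to [0, 1]
    so that alpha + similarity stays strictly positive."""
    return 0.5 * cos_sim + 0.5


def weighted_sum_loss(s1, s2, w1=1.0, w2=1.0):
    """The 'conventional' option from the question: minimize the
    negative weighted sum of the two similarities."""
    return -(w1 * s1 + w2 * s2)


def inverse_loss(s1, s2, alpha=1.0, w1=1.0, w2=1.0):
    """Hypothetical inverse form: each term is 1 / (alpha + similarity).
    alpha > 0 keeps the denominator away from zero."""
    return w1 / (alpha + s1) + w2 / (alpha + s2)


if __name__ == "__main__":
    low = to_unit_interval(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
    high = to_unit_interval(cosine_similarity([1.0, 0.0], [1.0, 0.1]))
    # Both losses get smaller as the similarities get larger.
    print(weighted_sum_loss(low, low), weighted_sum_loss(high, high))
    print(inverse_loss(low, low), inverse_loss(high, high))
```

Under these assumptions, both losses decrease as the similarities increase, so minimizing either one maximizes similarity. The difference is in the gradient: the derivative of `1 / (alpha + s)` with respect to `s` is `-1 / (alpha + s)**2`, which is larger in magnitude when `s` is small, while the weighted sum has a constant gradient. Whether that is the motivation behind the paper's design is a question for the authors.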