Open jtawade opened 2 years ago
During training, why is the loss scaled by a log-variance value that is learned by the model?

https://github.com/CompVis/stable-diffusion/blob/ce05de28194041e030ccfc70c635fe3707cdfc30/ldm/models/diffusion/ddpm.py#L1031

This log-variance value doesn't seem to be used anywhere else except to scale the loss.
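For reference, here is a minimal sketch of the weighting pattern at that line: the per-sample loss is divided by `exp(logvar_t)` and `logvar_t` is added back, a standard learned-uncertainty (Gaussian NLL-style) weighting with one entry per timestep. The helper `weighted_loss` and the dummy tensors below are illustrative, not the repo's exact code:

```python
import torch
import torch.nn as nn

# Sketch of the per-timestep learned log-variance weighting (simplified;
# `num_timesteps`, `logvar_init`, `learn_logvar` mirror names in ddpm.py,
# but this is not the exact implementation).
num_timesteps = 1000
logvar_init = 0.0
learn_logvar = True

# One log-variance entry per diffusion timestep; trainable only when
# learn_logvar is True.
logvar = nn.Parameter(torch.full((num_timesteps,), logvar_init),
                      requires_grad=learn_logvar)

def weighted_loss(loss_simple, t):
    # loss_simple: per-sample MSE between predicted and true noise, shape (B,)
    # t: sampled diffusion timesteps, shape (B,)
    logvar_t = logvar[t]
    # Dividing by exp(logvar) down-weights timesteps with large error;
    # the +logvar term penalizes inflating the variance, otherwise the
    # optimizer would push logvar -> +inf to zero out the first term.
    return (loss_simple / torch.exp(logvar_t) + logvar_t).mean()

# Dummy usage:
loss_simple = torch.rand(4)                   # stand-in per-sample MSE
t = torch.randint(0, num_timesteps, (4,))
weighted_loss(loss_simple, t).backward()      # gradients reach `logvar`
```

One consequence of this form: setting the derivative of `m / exp(v) + v` with respect to `v` to zero gives `exp(v) = m`, so the optimal logvar just tracks the log of the current MSE. That would be consistent with logvar steadily decreasing alongside the training loss rather than converging on its own.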
I'm curious about this, too. Have you found any reason? logvar simply seems to decrease over training steps when it is trained. I see that this affects the decoder variance, but it does not seem to converge.

Curious about this, too.