openai / improved-diffusion

Release for Improved Denoising Diffusion Probabilistic Models
MIT License

A problem about the weight λ of Lvlb #114

Open yinguanchun opened 1 year ago

yinguanchun commented 1 year ago

In the paper, λ is 0.001. The code sets learn_sigma to True and rescale_learned_sigmas to False, so the loss type will be gd.LossType.MSE, and in this loss type the L_vlb term is never multiplied by 0.001. Even if the loss type is gd.LossType.RESCALED_MSE, the only scaling applied is terms["vb"] *= self.num_timesteps / 1000.0. What is self.num_timesteps, and what is its effect? Thank you.
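
For reference, here is a minimal, self-contained sketch of how I read the flag handling and the rescaling branch (paraphrased from improved_diffusion/gaussian_diffusion.py as I understand it; the helper names `pick_loss_type` and `combine_terms` are my own stand-ins, not functions from the repo):

```python
from enum import Enum, auto


class LossType(Enum):
    MSE = auto()
    RESCALED_MSE = auto()
    KL = auto()
    RESCALED_KL = auto()


def pick_loss_type(use_kl: bool, rescale_learned_sigmas: bool) -> LossType:
    """My reading of the flag handling: with use_kl=False and
    rescale_learned_sigmas=False the loss ends up as plain MSE."""
    if use_kl:
        return LossType.RESCALED_KL
    if rescale_learned_sigmas:
        return LossType.RESCALED_MSE
    return LossType.MSE


def combine_terms(mse: float, vb: float, loss_type: LossType, num_timesteps: int) -> float:
    """Toy stand-in for the end of training_losses(): `vb` is the
    per-timestep variational term L_t for the single sampled t, and
    `num_timesteps` is T, the length of the beta schedule."""
    if loss_type == LossType.RESCALED_MSE:
        # Only this branch applies the T / 1000 factor the issue asks about.
        vb = vb * num_timesteps / 1000.0
    return mse + vb


if __name__ == "__main__":
    lt = pick_loss_type(use_kl=False, rescale_learned_sigmas=False)
    print(lt)  # LossType.MSE -> the vb term enters the loss unscaled
    print(combine_terms(0.05, 0.2, lt, num_timesteps=4000))
```

So, as far as I can tell, the T / 1000 factor only ever appears in the RESCALED_MSE branch, and self.num_timesteps is simply T, the number of diffusion steps in the beta schedule.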

zen-d commented 10 months ago

@yinguanchun I am also confused about this scaling factor; have you figured it out?

Feynman1999 commented 6 months ago

I am also confused about this scaling factor; have you figured it out?

yhy258 commented 4 months ago

In my opinion, the authors define L_{vlb} = L_0 + L_1 + ... + L_T, i.e. a sum over all timesteps, not a single L_t. Since training evaluates the loss at only one sampled timestep per example, they may be scaling that per-timestep vlb term by T (self.num_timesteps) so that it estimates the full sum.
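
If that reading is right, the factor can be motivated like this (my own derivation, assuming t is drawn uniformly over the T timesteps during training; this is not stated explicitly in the thread above):

```latex
\mathbb{E}_{t \sim \mathcal{U}\{0,\dots,T-1\}}\!\left[\, T \cdot L_t \,\right]
  = \sum_{t=0}^{T-1} L_t \;\approx\; L_{\mathrm{vlb}},
\qquad
\lambda \, L_{\mathrm{vlb}} \;\approx\; 0.001 \cdot T \cdot L_t \;=\; \frac{T}{1000}\, L_t .
```

That would match terms["vb"] *= self.num_timesteps / 1000.0: multiplying by T turns the single sampled term into an estimate of the whole sum, and dividing by 1000 supplies the λ = 0.001 from the paper.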

unl1002 commented 1 week ago

@yhy258 Thank you for your answer. So that means we use L_t * T (self.num_timesteps) to approximate L_{vlb}?
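
A quick numeric sanity check of that reading (the per-timestep values below are synthetic, chosen only to illustrate the unbiased-estimator argument; they are not real losses from the model):

```python
import math
import random

# Synthetic per-timestep VLB terms L_0, ..., L_{T-1}; made up purely to
# illustrate the estimator, not real diffusion losses.
T = 4000
L = [math.exp(-t / T) for t in range(T)]
full_sum = sum(L)

# Monte Carlo estimate: sample t uniformly and scale the single term by T,
# which is what multiplying terms["vb"] by self.num_timesteps corresponds to.
random.seed(0)
n_samples = 100_000
estimate = sum(T * L[random.randrange(T)] for _ in range(n_samples)) / n_samples

print(f"sum_t L_t       = {full_sum:.1f}")
print(f"E[T * L_t] (MC) = {estimate:.1f}")  # close to the sum above
```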