zoubohao / DenoisingDiffusionProbabilityModel-ddpm-

This may be the simplest implementation of DDPM. You can run Main.py directly to train the UNet on the CIFAR-10 dataset and watch the denoising process.
MIT License

Doubt about the loss computation in train.py #10

Open ZhouHaoWa opened 1 year ago

ZhouHaoWa commented 1 year ago

`loss = trainer(x_0).sum() / 1000.`

Why is it divided by 1000?
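(For context, a rough sketch of where this line sits in a training step. This is a paraphrase, not the repo's code: the real project trains a UNet with a Gaussian diffusion trainer, while the stand-ins below only show that `trainer(x_0)` returns an unreduced, element-wise loss tensor, so `.sum()` adds up every pixel of every image in the batch before the constant scales it back down.)

```python
import torch

model = torch.nn.Linear(3 * 32 * 32, 3 * 32 * 32)   # stand-in for the UNet
optimizer = torch.optim.Adam(model.parameters(), lr=2e-4)

def trainer(x_0):
    # Stand-in for the diffusion trainer: element-wise squared error
    # against sampled noise, with no reduction applied yet.
    noise = torch.randn_like(x_0)
    return (model(x_0) - noise) ** 2

x_0 = torch.randn(80, 3 * 32 * 32)                   # a fake CIFAR-10 batch
loss = trainer(x_0).sum() / 1000.                    # the line in question
optimizer.zero_grad()
loss.backward()
optimizer.step()
```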

HBB0517 commented 1 year ago

Because 1000 can create an isotropic distribution.
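(One reading of this comment: 1000 is also T, the number of forward diffusion steps, and after that many steps any input collapses to isotropic Gaussian noise. A quick numerical sketch, assuming the standard linear beta schedule from the DDPM paper; none of this is quoted from the thread:)

```python
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)           # linear schedule, DDPM paper
alpha_bar = torch.cumprod(1.0 - betas, dim=0)   # cumulative signal retention

x_0 = torch.rand(3, 32, 32)                     # any image
noise = torch.randn_like(x_0)
# Closed-form forward process q(x_T | x_0):
x_T = alpha_bar[-1].sqrt() * x_0 + (1 - alpha_bar[-1]).sqrt() * noise

print(alpha_bar[-1])           # ~4e-5: almost no signal left at step 1000
print(x_T.mean(), x_T.std())   # near 0 and 1: isotropic Gaussian noise
```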

huchenz1 commented 1 year ago

> Because 1000 can create an isotropic distribution.

Could you please tell me what "isotropic" means? Thanks!

xpdd123 commented 1 year ago

It just makes gradient descent faster.

Klawens commented 11 months ago

It's common to scale the loss by a constant factor to control the magnitude of gradients during training. Large gradients can lead to unstable training, so scaling the loss down can help stabilize the optimization process.
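(A minimal sketch of that point, not from the repo: dividing a summed loss by a constant scales every gradient by the same constant, so a fixed-learning-rate optimizer takes proportionally smaller steps.)

```python
import torch

def grad_norm(scale):
    torch.manual_seed(0)
    w = torch.randn(3072, requires_grad=True)  # ~ one CIFAR-10 image of values
    loss = (w ** 2).sum() / scale              # summed loss, scaled by a constant
    loss.backward()
    return w.grad.norm().item()

print(grad_norm(1.))     # unscaled: large gradient norm
print(grad_norm(1000.))  # exactly 1000x smaller
```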