zoubohao / DenoisingDiffusionProbabilityModel-ddpm-

This may be the simplest implementation of DDPM. You can directly run Main.py to train the UNet on the CIFAR-10 dataset and see the amazing process of denoising.
MIT License

How to determine if 他和 #40

MithianH closed this issue 4 months ago

MithianH commented 4 months ago
Sorry to bother you, but I wonder whether we can judge if the network has converged by the value of the loss.

The loss at the 170th epoch is not significantly different from the loss at the beginning. Like this:
....
100%|██████████| 625/625 [03:20<00:00, 3.12it/s, epoch=4, loss: =6.78, img shape: =torch.Size([80, 3, 32, 32]), LR=0.00012]
100%|██████████| 625/625 [03:20<00:00, 3.12it/s, epoch=5, loss: =8.38, img shape: =torch.Size([80, 3, 32, 32]), LR=0.000125]
····
100%|██████████| 625/625 [03:21<00:00, 3.11it/s, epoch=168, loss: =6.01, img shape: =torch.Size([80, 3, 32, 32]), LR=3.27e-5]
100%|██████████| 625/625 [03:21<00:00, 3.10it/s, epoch=169, loss: =9.78, img shape: =torch.Size([80, 3, 32, 32]), LR=3.15e-5]
100%|██████████| 625/625 [03:21<00:00, 3.10it/s, epoch=170, loss: =6.75, img shape: =torch.Size([80, 3, 32, 32]), LR=3.04e-5]

Thank you very much : )

Originally posted by @GritLs in https://github.com/zoubohao/DenoisingDiffusionProbabilityModel-ddpm-/issues/5#issuecomment-1288088155

Hello, I'm running into a problem similar to the one described above: I've trained for many steps, but the loss does not seem to converge and stays close to its initial value. However, the generated images look quite satisfactory. I've read through all the discussion in that issue, but I still don't understand how to tell whether the loss has converged. Is this normal, or is something else wrong? Could you or the author please advise on how you ultimately resolved this? Thank you all.

(Sorry about the mistyped title; I accidentally pressed Enter one extra time.)

zoubohao commented 4 months ago

Yes, it is normal. The key point of DDPM is to train a model to predict the noise from a Gaussian distribution. It cannot predict the noise exactly. The longer you train the model, the better performance it will have.
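
One way to see this in practice is to compare smoothed or windowed averages of the loss rather than single progress-bar readings. The sketch below is only an illustration and is not code from this repository: the loss values are synthetic (a slowly falling mean plus large per-batch noise), and the names `ema` and `synthetic_losses` are made up for this example. It assumes the per-batch noise is larger than the overall trend, which is what the log above looks like.

```python
import random


def ema(values, beta=0.98):
    """Exponential moving average with bias correction for the first steps."""
    smoothed, avg = [], 0.0
    for step, value in enumerate(values, start=1):
        avg = beta * avg + (1.0 - beta) * value
        smoothed.append(avg / (1.0 - beta ** step))
    return smoothed


def synthetic_losses(n_steps=2000, seed=0):
    """Made-up noisy loss curve (NOT real training data from this repo)."""
    rng = random.Random(seed)
    losses = []
    for step in range(n_steps):
        # The mean falls slowly from 8.0 towards 7.0 over the run.
        mean = 7.0 + 1.0 * (1.0 - step / n_steps)
        # The per-batch noise is larger than the whole trend.
        losses.append(mean + rng.gauss(0.0, 1.5))
    return losses


losses = synthetic_losses()
smoothed = ema(losses)

# Single readings overlap heavily; windowed averages expose the trend.
early = sum(losses[:200]) / 200
late = sum(losses[-200:]) / 200
print(f"raw first / last step: {losses[0]:.2f} / {losses[-1]:.2f}")
print(f"mean of first / last 200 steps: {early:.2f} / {late:.2f}")
print(f"EMA at step 200 / final step: {smoothed[199]:.2f} / {smoothed[-1]:.2f}")
```

Even with smoothing, the loss will not approach zero, since the target noise is random and cannot be predicted exactly. Inspecting sampled images, as done in this thread, remains a useful check alongside the loss curve.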

MithianH commented 4 months ago

Thank you for your reply!