zoubohao / DenoisingDiffusionProbabilityModel-ddpm-

This may be the simplest implement of DDPM. You can directly run Main.py to train the UNet on CIFAR-10 dataset and see the amazing process of denoising.
MIT License
1.59k stars 170 forks source link

Deadly Bug found and solved !!! Must be fixed !!! #14

Open TZYSJTU opened 1 year ago

TZYSJTU commented 1 year ago

The beta calculation is wrong! Compared with the official DDPM code, three important lines are missing in this unofficiial version !

Below are the official codes of beta's calculation: def linear_beta_schedule(timesteps): """ linear schedule, proposed in original ddpm paper """ scale = 1000 / timesteps beta_start = scale 0.0001 beta_end = scale 0.02 return torch.linspace(beta_start, beta_end, timesteps, dtype = torch.float64)

the scale related three lines are missing in your code, which leads to a deadly bug:

when using T = 100 or 50 or 20, much less than 1000 the calcuated x_T is not standard Gaussian! Therefore, you can't sample from a standard Gaussian after training.

You can easily check my word by calculating the mean and var of x_T !

jinzi98 commented 1 year ago

I think T need to be a big number to make x_T approximately equal to a standard Gaussian. 20, 50 and 100 are too small so x_T is not Gaussian.

TZYSJTU commented 1 year ago

wrong! What ever T=50, 100, 200, 1000.... x_T is standard Gaussian!  

Einsames Brandenburg @.***

 

------------------ 原始邮件 ------------------ 发件人: @.>; 发送时间: 2023年3月1日(星期三) 下午5:45 收件人: @.>; 抄送: "Einsames @.>; @.>; 主题: Re: [zoubohao/DenoisingDiffusionProbabilityModel-ddpm-] Deadly Bug found and solved !!! Must be fixed !!! (Issue #14)

I think T need to be a big number to make x_T approximately equal to a standard Gaussian. 20, 50 and 100 are too small so x_T is not Gaussian.

— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you authored the thread.Message ID: @.***>

HIT-LiuChen commented 1 year ago

Here is not deadly bug. In official code, scale is related to timesteps but if read the code carefully, you wiil find that timesteps is set to 1000 in modelConfig. I also don't think X_T is standard Gaussian when using T = 100 or 50 or 20 much less than 1000. In other words, if it becomes standard Gaussian in preceding step, there is no need to set T as 1000 or more.

TZYSJTU commented 1 year ago

You are totally wrong. I think you should so some work about math.   

Einsames Brandenburg @.***

 

------------------ 原始邮件 ------------------ 发件人: @.>; 发送时间: 2023年3月23日(星期四) 下午3:45 收件人: @.>; 抄送: "Einsames @.>; @.>; 主题: Re: [zoubohao/DenoisingDiffusionProbabilityModel-ddpm-] Deadly Bug found and solved !!! Must be fixed !!! (Issue #14)

Here is not deadly bug. In official code, scale is related to timesteps but if read the code carefully, you wiil find that timesteps is set to 1000 in modelConfig. I also don't think X_T is standard Gaussian when using T = 100 or 50 or 20 much less than 1000. In other words, if it becomes standard Gaussian in preceding step, there is no need to set T as 1000 or more.

— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you authored the thread.Message ID: @.***>

ljh430 commented 3 months ago

You are totally wrong. I think you should so some work about math.    Einsames Brandenburg @.   ------------------ 原始邮件 ------------------ 发件人: @.>; 发送时间: 2023年3月23日(星期四) 下午3:45 收件人: @.>; 抄送: "Einsames @.>; @.>; 主题: Re: [zoubohao/DenoisingDiffusionProbabilityModel-ddpm-] Deadly Bug found and solved !!! Must be fixed !!! (Issue #14) Here is not deadly bug. In official code, scale is related to timesteps but if read the code carefully, you wiil find that timesteps is set to 1000 in modelConfig. I also don't think X_T is standard Gaussian when using T = 100 or 50 or 20 much less than 1000. In other words, if it becomes standard Gaussian in preceding step, there is no need to set T as 1000 or more. — Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you authored the thread.Message ID: @.>

so you believe we need to add below code: scale = 1000 / timesteps beta_start = scale 0.0001 beta_end = scale 0.02

to scale the beta plan when set T<1000, right?