mirthAI / Fast-DDPM

MIT License

Curious about fast training aspect #3

Closed sean1295 closed 2 weeks ago

sean1295 commented 1 month ago

Hello,

First of all, thanks for the great work and making it available to everyone.

I am interested in accelerating training for diffusion models and am curious about the method proposed in the paper.

From my understanding, the idea is basically to feed in only the inference timesteps during training (e.g., t ~ [999, 899, ..., 99]). But how is this different from training with 10 diffusion steps (t ~ [10, 9, 8, ..., 0]) and using those same diffusion steps during inference?

Please let me know and thanks for your time.

Sebastianjhx commented 1 month ago

Hi there, thanks for bringing up this interesting question.

Basically, since the SNR of the image depends on the cumulative product of the alphas (alpha_cumprod), a DDPM trained with only 10 steps under the standard beta noise scheduler is unlikely to reach a Gaussian distribution at the final step, as illustrated in Figure 2 of our paper. However, if you adjusted the alpha_cumprod values at each step of the 10-step scheduler (t ~ [10, 9, 8, ..., 0]) to match the SNR curve of the original DDPM, I believe there would then be no difference.
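A quick numerical sketch of this point (not code from the repo; the linear beta range 1e-4 to 0.02 is the common DDPM default and is assumed here): subsampling 10 timesteps from a 1000-step schedule keeps the final alpha_cumprod near zero (pure noise), while a fresh 10-step schedule over the same beta range does not.

```python
import numpy as np

def alpha_cumprod(betas):
    """Cumulative product of alphas; this sets the image SNR at each step."""
    return np.cumprod(1.0 - betas)

# (a) 1000-step linear beta schedule, subsampled at 10 inference
#     timesteps: the final alpha_cumprod is ~0, so the last step
#     is essentially pure Gaussian noise.
acp_1000 = alpha_cumprod(np.linspace(1e-4, 0.02, 1000))
subsampled = acp_1000[np.arange(99, 1000, 100)]  # t = 99, 199, ..., 999

# (b) Fresh 10-step linear beta schedule over the same beta range:
#     the product of only 10 alphas stays far from zero, so the
#     final step never approaches a Gaussian distribution.
acp_10 = alpha_cumprod(np.linspace(1e-4, 0.02, 10))

print(f"subsampled 1000-step schedule, final alpha_cumprod: {subsampled[-1]:.2e}")
print(f"fresh 10-step schedule,        final alpha_cumprod: {acp_10[-1]:.2f}")
```

With these defaults the subsampled schedule ends around 4e-5 (near-pure noise), while the fresh 10-step schedule ends around 0.90, i.e. the image signal is still dominant at the "final" step.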