huggingface / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
https://huggingface.co/docs/diffusers
Apache License 2.0
25.2k stars 5.21k forks source link

Formulation of reverse diffusion process in DDPM ( 1 - alpha_prod_t = beta_prod_t assumption) #9474

Open jadevaibhav opened 7 hours ago

jadevaibhav commented 7 hours ago

Describe the bug

I looked into sampling code of DDPM, and I believe there's a mistake:

I believe the code makes assumption that 1 - alpha_prod_t = beta_prod_t, which simply isn't true.

Original sampling algorithm: original paper image

x_0 prediction from sample and predicted noise :

Eqn 15 from paper, as referenced in code snippet as well: Screenshot 2024-09-19 at 6 26 48 PM

code implementation in diffusers: https://github.com/huggingface/diffusers/blob/1e8cf2763d37fa66cfba2fb87acfe3443068ce43/src/diffusers/schedulers/scheduling_ddpm.py#L446-L449

Here, it clearly seems like the 1 - alpha_prod_t = beta_prod_t is being used.

investigated from post by @AlejandroBaron in https://github.com/huggingface/diffusers/discussions/9431_

Reproduction

I haven't tested out code, but this seems fundamental formulation issue.

Logs

No response

System Info

0.30.3

Who can help?

@sayakpaul @DN6 @yiy

jadevaibhav commented 5 hours ago

It looks like its a unfortunate naming, where beta_prod_t is defined as 1 - alpha_prod_t: https://github.com/huggingface/diffusers/blob/1e8cf2763d37fa66cfba2fb87acfe3443068ce43/src/diffusers/schedulers/scheduling_ddpm.py#L438-L444

I think it should be renamed for clarity.

asomoza commented 5 hours ago

cc: @yiyixuxu

sayakpaul commented 5 hours ago

Perhaps a comment could be added to clarify the confusion. That seems like the easiest solution to me.