huggingface / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
https://huggingface.co/docs/diffusers
Apache License 2.0
26.08k stars 5.38k forks source link

"previous_timestep()" in DDPM scheduling not compatible with "trailing" option. DDIM bugged too #9261

Open RochMollero opened 2 months ago

RochMollero commented 2 months ago

Describe the bug

I'm currently on a quest to remove all noise from my very noise-sensitive diffusion application, so checking all the equations right now and I think I found one error:

in scheduling_ddmp.py, previous_timesteps() currently assumes two cases

However, though this formula may be worling for the "leading" scheduler option, it's not correct for "trailing", nor likely for "linspace" since there's a trick with round errors. For example if I use trailing with 1000 training steps and 13 generation steps we have the steps define in set_timesteps:

tensor([999, 922, 845, 768, 691, 614, 537, 461, 384, 307, 230, 153, 76])

but when the successive steps are actually applied we have:

t, prev_t : tensor(999) tensor(923) tensor(922) tensor(846) tensor(845) tensor(769) tensor(768) tensor(692) tensor(691) tensor(615) tensor(614) tensor(538) tensor(537) tensor(461) tensor(461) tensor(385) tensor(384) tensor(308) tensor(307) tensor(231) tensor(230) tensor(154) tensor(153) tensor(77)
tensor(76) tensor(0)

Basically, most 'prev_t' are wrong because of the rounding error.

@yiyixuxu and @bghira (from what I've seen about your contribution on 0snr & vprediction you might be relevant for the discussion.)

Also the same formula is used in DDIM so trailing is wrong too. Note that DDIM does not (currently, or maybe ever?) support custom_timesteps.

As a first fix, I believe setting using custom_timesteps=True for "trailing" option should work, but I suspect you may want to have more work to fix this in all schedulers.

Please not that this is quite important since trailing is being recognized as the most relevant way to step, leading and linspace being likely suboptimal for reasons explained in the iddpm and the "Common Diffusion Noise Schedules and Sample Steps are Flawed" paper.

Reproduction

Literally any model with trailing and a prime number for n_steps.

Logs

No response

System Info

Ubuntu 22

Who can help?

@bghira @yiyixuxu

RochMollero commented 2 months ago

Up ! because it seems important enough to me

RochMollero commented 2 months ago

up @yiyixuxu @bghira

yiyixuxu commented 2 months ago

@RochMollero thanks! indeed, it is incorrect do you want to open a PR to help us fix it? the easier option here is to modify the previous_timesteps so that it calculates differently based on self.config.timestep_spacing

AnandK27 commented 2 months ago

Hey @yiyixuxu @RochMollero, I will work on this!

github-actions[bot] commented 1 month ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

a-r-r-o-w commented 3 weeks ago

@yiyixuxu Is this to be closed based on your comment or is this not stale?