Open joe-aivatarz opened 5 months ago
Hmm the results do look good. But from the project page they mention:
> We leverage methods from stochastic calculus and find optimal schedules specific to different solvers, trained DMs and datasets.

So each scheduler would have its own set of aligned steps. We can support this by allowing the timesteps to be set directly for all schedulers.
@yiyixuxu wdyt? The Colab notebook example the authors have provided uses diffusers and modifies the DPM scheduler
```python
import numpy as np
import torch
from diffusers import DPMSolverMultistepScheduler as DefaultDPMSolver


# Add support for setting custom timesteps
class DPMSolverMultistepScheduler(DefaultDPMSolver):
    def set_timesteps(
        self, num_inference_steps=None, device=None,
        timesteps=None
    ):
        if timesteps is None:
            super().set_timesteps(num_inference_steps, device)
            return

        all_sigmas = np.array(((1 - self.alphas_cumprod) / self.alphas_cumprod) ** 0.5)
        self.sigmas = torch.from_numpy(all_sigmas[timesteps])
        self.timesteps = torch.tensor(timesteps[:-1]).to(device=device, dtype=torch.int64)  # Ignore the last 0
        self.num_inference_steps = len(timesteps)

        self.model_outputs = [
            None,
        ] * self.config.solver_order
        self.lower_order_nums = 0

        # add an index counter for schedulers that allow duplicated timesteps
        self._step_index = None
        self._begin_index = None
        self.sigmas = self.sigmas.to("cpu")  # to avoid too much CPU/GPU communication
```
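For reference, the `all_sigmas` line converts the cumulative alpha products into noise levels, sigma_t = sqrt((1 - alphas_cumprod_t) / alphas_cumprod_t), and the custom timestep indices then pick entries out of that array. Below is a minimal sketch of that mapping in plain Python. It assumes the scaled-linear beta schedule commonly used with Stable Diffusion 1.5 (`beta_start=0.00085`, `beta_end=0.012`, 1000 training steps); those values and the helper name `sd_sigmas` are assumptions for illustration, not something stated in this thread.

```python
import math


def sd_sigmas(num_train_timesteps=1000, beta_start=0.00085, beta_end=0.012):
    """Noise level for every training timestep (hypothetical helper).

    Assumes a scaled-linear beta schedule: betas are linear in sqrt space.
    """
    lo, hi = math.sqrt(beta_start), math.sqrt(beta_end)
    sigmas = []
    alpha_cumprod = 1.0
    for i in range(num_train_timesteps):
        beta = (lo + (hi - lo) * i / (num_train_timesteps - 1)) ** 2
        alpha_cumprod *= 1.0 - beta
        # Same formula as the all_sigmas line in the scheduler subclass
        sigmas.append(math.sqrt((1.0 - alpha_cumprod) / alpha_cumprod))
    return sigmas


# Stable Diffusion 1.5 timestep indices from the schedule table in this thread
AYS_SD15_TIMESTEPS = [999, 850, 736, 645, 545, 455, 343, 233, 124, 24, 0]

all_sigmas = sd_sigmas()
schedule_sigmas = [all_sigmas[t] for t in AYS_SD15_TIMESTEPS]
print([round(s, 3) for s in schedule_sigmas])
```

Under these assumptions the first and last entries should land close to the 14.615 and 0.029 endpoints listed for Stable Diffusion 1.5. The intermediate values depend on how the published indices were rounded.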
I'm wondering how to modify EDMDPMSolverMultistepScheduler to get a similar result. I think EDMDPMSolverMultistepScheduler is meant for continuous diffusion models like EDM, where the noise level values can be given directly to the model as sigma inputs. I wrote a script, but it doesn't work. Any idea how to fix it?
```python
import numpy as np
from diffusers import DiffusionPipeline, EDMDPMSolverMultistepScheduler as DefaultDPMSolver

class EDMDPMSolverMultistepScheduler(DefaultDPMSolver):
    def set_timesteps(self, num_inference_steps=None, device=None):
        # self.num_inference_steps = num_inference_steps
        # ramp = np.linspace(0, 1, self.num_inference_steps)
        # sigmas = self._compute_sigmas(ramp)
        sigmas = np.array([700.00, 54.5, 15.886, 7.977, 4.248, 1.789, 0.981, 0.403, 0.173, 0.034, 0.002])
        # self.num_inference_steps = len(sigmas)
        sigmas = torch.from_numpy(sigmas).to(dtype=torch.float32, device=device)
        self.timesteps = self.precondition_noise(sigmas)
        # self.sigmas = torch.cat([sigmas, torch.tensor([sigma_last], dtype=torch.float32, device=device)])
        print("sigmas=", self.sigmas)
        self.model_outputs = [
            None,
        ] * self.config.solver_order
        self.lower_order_nums = 0
        # add an index counter for schedulers that allow duplicated timesteps
        self._step_index = None
        self._begin_index = None
        self.sigmas = self.sigmas.to("cpu")  # to avoid too much CPU/GPU communication
```
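One thing visible in the script itself: the line that assigns `self.sigmas` is commented out, so the custom array only ever reaches `self.timesteps`, and `torch` is never imported. The DPM subclass earlier in the thread keeps the full schedule in `sigmas` and drops the final entry from `timesteps`. Here is a minimal sketch of that same layout for continuous noise levels, written in plain Python so it runs without diffusers. The `0.25 * log(sigma)` conditioning follows the EDM paper's c_noise and is assumed to match what `precondition_noise` computes; `build_edm_schedule` is a hypothetical helper, not a diffusers API.

```python
import math

# Stable Video Diffusion noise levels from the schedule table in this thread
SVD_AYS_SIGMAS = [700.00, 54.5, 15.886, 7.977, 4.248, 1.789, 0.981, 0.403, 0.173, 0.034, 0.002]


def build_edm_schedule(sigmas):
    """Split a noise-level schedule into (timesteps, sigmas).

    Mirrors the DPM subclass above: the last schedule entry is treated as the
    terminal noise level, so it stays in `sigmas` but gets no timestep.
    Assumption: the model is conditioned on c_noise = 0.25 * ln(sigma).
    """
    timesteps = [0.25 * math.log(s) for s in sigmas[:-1]]
    return timesteps, list(sigmas)


timesteps, sigmas = build_edm_schedule(SVD_AYS_SIGMAS)
print(len(timesteps), len(sigmas))
```

In a scheduler subclass, the two lists would become `self.timesteps` and `self.sigmas` (as tensors), with `self.num_inference_steps` set to the number of timesteps. Whether that is sufficient for the SVD pipeline was not confirmed in this thread.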
Having a `set_timesteps` for all relevant schedulers would also make it much easier to implement things like this: https://github.com/huggingface/diffusers/issues/7651
@DN6
I'm a little bit confused: it says each scheduler would have its own optimized schedule, but this is what they provide. Can these timesteps be used for all schedulers for these models?
cc @asomoza here; maybe you have better ideas, since it is popular in the community
Model | Schedule (noise levels) | Schedule (timestep indices) |
---|---|---|
Stable Diffusion 1.5 | [14.615, 6.475, 3.861, 2.697, 1.886, 1.396, 0.963, 0.652, 0.399, 0.152, 0.029] | [999, 850, 736, 645, 545, 455, 343, 233, 124, 24, 0] |
SDXL | [14.615, 6.315, 3.771, 2.181, 1.342, 0.862, 0.555, 0.380, 0.234, 0.113, 0.029] | [999, 845, 730, 587, 443, 310, 193, 116, 53, 13, 0] |
DeepFloyd-IF / Stage-1 | [160.41, 8.081, 3.315, 1.885, 1.207, 0.785, 0.553, 0.293, 0.186, 0.030, 0.006] | [995, 920, 811, 686, 555, 418, 315, 174, 109, 12, 0] |
Stable Video Diffusion | [700.00, 54.5, 15.886, 7.977, 4.248, 1.789, 0.981, 0.403, 0.173, 0.034, 0.002] | NA |
To support this more natively, I think we can extend the `timesteps` argument of the `set_timesteps` method to more schedulers, and also extend it to the SVD pipeline they mentioned.
I would first figure out which schedulers/pipelines these optimized steps can be applied to, and only support those selected schedulers for now.
Initially I thought it only worked for DPM schedulers, but I've been testing it in ComfyUI, where they enabled it for all of them. So far it works with all of them, but I think that in all the SDE variants it's a lot worse (not usable).
Also, in SDXL it's less noticeable, but you still get the performance gain.
prompt = "anthropomorphic capybara wearing a suit and working with a computer"
ays = 10 steps
normal = 25 steps

[Image comparison table, ays vs. normal, one row per sampler: Euler, Heun, dpm_2, dpmpp_2m, dpmpp_2m_sde_gpu. Images not preserved.]
I'm curious whether it works with fewer steps (2 or 4 steps, like distillation methods). How can we derive the optimized timesteps from scratch?
> I'm a little bit confused: it says each scheduler would have its own optimized schedule, but this is what they provide. Can these timesteps be used for all schedulers for these models?

Hmm yeah, that is a bit confusing. I interpreted it as meaning that a unique schedule exists for each solver. From the contributions section of the paper, I guess they mean the optimized schedule is applicable to multiple solvers for a given model type:
> (iv) We provide the optimized schedules for several commonly used models in the appendix to allow for easy plug-and-play use by the research community

@haofanwang they described their method in the paper - not sure how easy it is to reproduce & how much compute it would cost
Is there a way to run inference with more than 10 steps?
@DeFek1 you can use a linear interpolation in between those timestep indices
@wonkyoc how? Is there any code I can try?
The original repo includes the code, or you could use numpy or another linear algebra library.
https://research.nvidia.com/labs/toronto-ai/AlignYourSteps/howto.html
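As a rough illustration of the suggestion above, here is a minimal sketch that linearly interpolates between the published timestep indices to get a longer schedule. It is plain Python with no dependencies; `interpolate_schedule` is a hypothetical helper written for this example, and the how-to page linked above remains the reference for the authors' own interpolation code.

```python
def interpolate_schedule(timesteps, num_points):
    """Linearly resample a timestep schedule to `num_points` entries.

    Keeps the first and last entries fixed and rounds to integer indices.
    Requires num_points >= 2 and at least two input timesteps.
    """
    n = len(timesteps)
    result = []
    for i in range(num_points):
        # Fractional position of output point i on the original schedule
        pos = i * (n - 1) / (num_points - 1)
        lo = min(int(pos), n - 2)
        frac = pos - lo
        value = timesteps[lo] + frac * (timesteps[lo + 1] - timesteps[lo])
        result.append(int(round(value)))
    return result


# Stable Diffusion 1.5 timestep indices from the schedule table in this thread
AYS_SD15_TIMESTEPS = [999, 850, 736, 645, 545, 455, 343, 233, 124, 24, 0]

# 21 entries correspond to 20 sampling steps (the final 0 is the endpoint)
longer = interpolate_schedule(AYS_SD15_TIMESTEPS, 21)
print(longer)
```

Rounding can produce repeated indices when the schedule is stretched a lot, so it may be worth deduplicating the result before passing it to a scheduler.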
Align Your Steps: Optimizing Sampling Schedules in Diffusion Models is a general and principled approach to optimizing the sampling schedules of DMs for high-quality outputs. This work is presented by Nvidia labs in this paper: https://arxiv.org/abs/2404.14507
The project page is here: https://research.nvidia.com/labs/toronto-ai/AlignYourSteps/
The page proposes a very small change that has a big impact on inference quality.