LaurentMazare / diffusers-rs

An implementation of the diffusers api in Rust
Apache License 2.0

Implement Euler Discrete Scheduler #32

Closed mspronesti closed 1 year ago

mspronesti commented 1 year ago

Hi @LaurentMazare, this PR aims at integrating the Euler Discrete Scheduler into this repository, solving the first task mentioned in #23.

The implementation should contain all the features of the Python version. Please note that, unlike the other schedulers currently supported, init_noise_sigma is not equal to 1.0 here. This scheduler also requires scaling the denoising model input. One could say that the other 3 schedulers supported by diffusers-rs are a "special" case.

Therefore, to run the stable diffusion pipeline with this scheduler, one needs to multiply the initial latents by init_noise_sigma here (Python version):

let mut latents = Tensor::randn(
    &[bsize, 4, sd_config.height / 8, sd_config.width / 8],
    (Kind::Float, unet_device),
) * scheduler.init_noise_sigma();

and to scale the model input after this line (Python version):

let latent_model_input = scheduler.scale_model_input(latent_model_input, timestep);
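To make the two quantities above concrete, here is a minimal sketch in plain Rust, using `Vec<f64>` instead of `tch` tensors. It is not the code from this PR: the function names and the scaled-linear beta schedule with `beta_start = 0.00085`, `beta_end = 0.012` are assumptions chosen to mirror the usual Stable Diffusion settings. It computes sigma = sqrt((1 - alpha_cumprod) / alpha_cumprod), takes the largest sigma as the initial noise scale, and divides the model input by sqrt(sigma^2 + 1).

```rust
/// Sigmas over the training timesteps for a scaled-linear beta schedule.
/// Assumes `train_steps >= 2`. Hypothetical helper, not the diffusers-rs API.
fn training_sigmas(beta_start: f64, beta_end: f64, train_steps: usize) -> Vec<f64> {
    let (s, e) = (beta_start.sqrt(), beta_end.sqrt());
    let mut alpha_cumprod = 1.0;
    (0..train_steps)
        .map(|i| {
            // Interpolate linearly in sqrt(beta) space, then square.
            let t = i as f64 / (train_steps - 1) as f64;
            let beta = (s + t * (e - s)).powi(2);
            alpha_cumprod *= 1.0 - beta;
            ((1.0 - alpha_cumprod) / alpha_cumprod).sqrt()
        })
        .collect()
}

/// Initial noise scale: the largest sigma of the schedule.
fn init_noise_sigma(sigmas: &[f64]) -> f64 {
    sigmas.iter().copied().fold(f64::MIN, f64::max)
}

/// Scale the denoising model input by 1 / sqrt(sigma^2 + 1).
fn scale_model_input(sample: &[f64], sigma: f64) -> Vec<f64> {
    let scale = (sigma * sigma + 1.0).sqrt();
    sample.iter().map(|x| x / scale).collect()
}
```

With these assumed settings and 1000 training steps the largest sigma is well above 1.0, which is why skipping the multiplication of the initial latents noticeably changes the result.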

HF's Python version keeps everything consistent by implementing init_noise_sigma() and scale_model_input() in all the schedulers. If you wish, I can fix the example in this PR (and, thereby, add a scale_model_input method to all the schedulers, which simply returns sample when no scaling is needed, i.e. what HF does). Otherwise, I can open a new one.
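One way to picture the uniform interface described above is a trait with default methods. This is only a sketch under assumptions: the trait name `Scheduler`, the two structs, and the use of `Vec<f64>` and a step index are hypothetical, and the actual schedulers in diffusers-rs operate on `tch` tensors and need not share a trait.

```rust
/// Hypothetical common interface; the defaults cover schedulers that need no scaling.
trait Scheduler {
    /// Factor applied to the initial latents (1.0 when no scaling is needed).
    fn init_noise_sigma(&self) -> f64 {
        1.0
    }

    /// Returns the sample unchanged unless a scheduler overrides this.
    fn scale_model_input(&self, sample: Vec<f64>, _step: usize) -> Vec<f64> {
        sample
    }
}

/// Stand-in for a scheduler that keeps the defaults.
struct UnitSigmaScheduler;

impl Scheduler for UnitSigmaScheduler {}

/// Stand-in for an Euler-style scheduler with one sigma per inference step.
struct EulerLike {
    sigmas: Vec<f64>,
}

impl Scheduler for EulerLike {
    fn init_noise_sigma(&self) -> f64 {
        self.sigmas.iter().copied().fold(f64::MIN, f64::max)
    }

    fn scale_model_input(&self, sample: Vec<f64>, step: usize) -> Vec<f64> {
        let sigma = self.sigmas[step];
        let scale = (sigma * sigma + 1.0).sqrt();
        sample.into_iter().map(|x| x / scale).collect()
    }
}
```

With such an interface the pipeline example can call both methods unconditionally, whichever scheduler is selected.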

Overall, I believe this PR is self-contained regardless of the pipeline examples.

EDIT: the above text is a little dense. To sum up:

EDIT 2: I did what I mentioned in the second bullet in this branch. If you agree, I will open a new PR once this gets merged (assuming you find these valuable contributions to diffusers-rs :) )

LaurentMazare commented 1 year ago

Thanks for the PR, I just merged it. Adding the scale_model_input step to the examples seems sensible, so that such schedulers can be used more easily.

mspronesti commented 1 year ago

Sure, I will open a new PR from that other branch then! :)

mspronesti commented 1 year ago

Done (#35).

sssemil commented 1 year ago

Hi, I'm getting blurry results with this one. I multiplied latents by scheduler.init_noise_sigma() and also added let latent_model_input = scheduler.scale_model_input(latent_model_input, timestep); after let latent_model_input = Tensor::cat(&[&latents, &latents], 0);. Anything else I could be missing? Thanks.
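For reference, the order of operations described in this comment can be sketched as a toy loop in plain Rust. This is an illustration under assumptions rather than the repository's code: `euler_step` and `denoise` are hypothetical names, samples are `Vec<f64>`, the model is assumed to predict the noise (epsilon), and the update shown is the deterministic Euler step `sample + noise_pred * (sigma_next - sigma)`.

```rust
/// One deterministic Euler step, assuming the model predicts the noise.
fn euler_step(sample: &[f64], noise_pred: &[f64], sigma: f64, sigma_next: f64) -> Vec<f64> {
    let dt = sigma_next - sigma;
    sample.iter().zip(noise_pred).map(|(x, e)| x + e * dt).collect()
}

/// Toy denoising loop. `sigmas` must be decreasing and end with 0.0.
fn denoise<F>(noise: &[f64], sigmas: &[f64], mut model: F) -> Vec<f64>
where
    F: FnMut(&[f64], f64) -> Vec<f64>,
{
    // 1. Multiply the initial latents by the initial sigma.
    let mut latents: Vec<f64> = noise.iter().map(|n| n * sigmas[0]).collect();
    for w in sigmas.windows(2) {
        let (sigma, sigma_next) = (w[0], w[1]);
        // 2. Scale only the model input; the step below uses the unscaled latents.
        let scale = (sigma * sigma + 1.0).sqrt();
        let input: Vec<f64> = latents.iter().map(|x| x / scale).collect();
        let noise_pred = model(&input, sigma);
        // 3. Advance the unscaled latents from sigma to sigma_next.
        latents = euler_step(&latents, &noise_pred, sigma, sigma_next);
    }
    latents
}
```

The point to check in a real pipeline is step 2: the scaled tensor is what goes into the UNet, while the scheduler step is applied to the unscaled latents.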

mspronesti commented 1 year ago

Hi @sssemil, try this main file (it comes from the other PR I opened yesterday). I just cloned the main branch of this repo again, copy-pasted this main (which simply generalizes the stable diffusion pipeline), changed the scheduler at line 245

let scheduler = EulerDiscreteScheduler::new(n_steps, Default::default());

and ran the diffusion pipeline as follows

cargo run --example stable-diffusion --features clap -- --prompt "A very rusty robot holding a fire torch." --sd-version=v1-5 --n-steps=15 --seed=91 --cpu all

this is the output I got:

[image: sd_final]

You can check whether the produced image is "appropriate for this scheduler" (compared to the Python version) using, for instance, the following Python snippet

import torch
from diffusers import StableDiffusionPipeline, EulerDiscreteScheduler

# Load Stable Diffusion v1.5 in half precision.
pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16)

# Swap in the Euler scheduler, reusing the pipeline's scheduler config
# (beta schedule etc.) instead of the library defaults.
pipe.scheduler = EulerDiscreteScheduler.from_config(pipe.scheduler.config)

pipe = pipe.to("cuda")
prompt = "A very rusty robot holding a fire torch."
image = pipe(prompt, num_inference_steps=15).images[0]

Let me know if it also works for you :)

sssemil commented 1 year ago

Ah, I see. Then this is just how this method works :) Euler A. just looks better in general.

mspronesti commented 1 year ago

Yes, in my experience Euler A. works better. Did you give my implementation of it a try (see the open PR)?