madebyollin / taesd

Tiny AutoEncoder for Stable Diffusion
MIT License
Preview for discrete schedulers? #5

Closed: skirsten closed this issue 1 year ago

skirsten commented 1 year ago

Hi, I am trying to generate preview images using something similar to https://github.com/madebyollin/taesd/blob/main/examples/Previewing_During_Image_Generation.ipynb.

alphas_cumprod = pipe.scheduler.alphas_cumprod.to(t.device)
alpha_prod_t = alphas_cumprod[int(t)]

pred_original_sample = (latents - (1 - alpha_prod_t) ** 0.5 * noise_pred) / alpha_prod_t**0.5
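In equation form, the snippet above inverts a single noising step. This is a sketch that assumes an epsilon-prediction model and the usual variance-preserving forward process (the thread does not state the prediction type). Here `latents` is written as x_t, `alpha_prod_t` as \bar\alpha_t, and `noise_pred` as \epsilon_\theta(x_t, t):

```latex
x_t = \sqrt{\bar\alpha_t}\, x_0 + \sqrt{1 - \bar\alpha_t}\, \epsilon
\quad\Longrightarrow\quad
\hat{x}_0 = \frac{x_t - \sqrt{1 - \bar\alpha_t}\, \epsilon_\theta(x_t, t)}{\sqrt{\bar\alpha_t}}
```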

Unfortunately, the formula from this notebook does not work well during the early steps of the discrete schedulers.

Non-discrete (works great), each shown with a webp preview image:

DDIMScheduler
DDPMScheduler
DPMSolverMultistepScheduler
PNDMScheduler

Discrete (not so great), each shown with a webp preview image:

EulerAncestralDiscreteScheduler
EulerDiscreteScheduler
LMSDiscreteScheduler

I also posted this as a diffusers issue, but I wanted to ask here too, since this is where I got the formula from.

I know this is not really related to taesd, but any help is appreciated!

madebyollin commented 1 year ago

Try this:

def get_pred_original_sample(sched, model_output, timestep, sample):
    return sample - sched.sigmas[(sched.timesteps == timestep).nonzero().item()] * model_output

Test gif with EulerAncestralDiscreteScheduler:

[image: preview_images]

(For future reference, the correct pred_original_sample expressions are usually hidden inside the corresponding scheduler code. EDIT: Oh, and also, for some schedulers, if you override kwargs['return_dict'] = True you can just use res.pred_original_sample and don't need get_pred_original_sample at all :P)
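The difference between the two formulas can be checked numerically without diffusers. The sketch below assumes, as my reading of the discrete schedulers suggests, that they keep latents in the form x0 + sigma * eps with sigma = sqrt((1 - alpha) / alpha), whereas the notebook formula expects sqrt(alpha) * x0 + sqrt(1 - alpha) * eps. The value of `alpha_prod_t` and the helper names are made up for illustration, and the model is assumed to predict the noise exactly.

```python
import math
import random

random.seed(0)
x0 = [random.gauss(0, 1) for _ in range(16)]   # stand-in for clean latents
eps = [random.gauss(0, 1) for _ in range(16)]  # stand-in for a perfect noise prediction

alpha_prod_t = 0.01  # hypothetical early-step value of alphas_cumprod[t]
sigma = math.sqrt((1 - alpha_prod_t) / alpha_prod_t)

# Latents in the convention the notebook formula expects.
latents_vp = [math.sqrt(alpha_prod_t) * x + math.sqrt(1 - alpha_prod_t) * e
              for x, e in zip(x0, eps)]
# Latents in the sigma convention (assumed for the discrete schedulers).
latents_sigma = [x + sigma * e for x, e in zip(x0, eps)]


def pred_x0_alpha(latent, noise_pred, alpha_prod_t):
    """Notebook formula: undo the sqrt(alpha) scaling and remove the noise."""
    return (latent - math.sqrt(1 - alpha_prod_t) * noise_pred) / math.sqrt(alpha_prod_t)


def pred_x0_sigma(latent, noise_pred, sigma):
    """Sigma formula from the reply above: just subtract sigma * noise."""
    return latent - sigma * noise_pred


def max_error(preds):
    """Largest absolute deviation from the true clean latents."""
    return max(abs(p - x) for p, x in zip(preds, x0))


err_alpha_on_vp = max_error(
    [pred_x0_alpha(l, e, alpha_prod_t) for l, e in zip(latents_vp, eps)])
err_sigma_on_sigma = max_error(
    [pred_x0_sigma(l, e, sigma) for l, e in zip(latents_sigma, eps)])
# Applying the notebook formula to sigma-convention latents goes badly wrong.
err_alpha_on_sigma = max_error(
    [pred_x0_alpha(l, e, alpha_prod_t) for l, e in zip(latents_sigma, eps)])

print(err_alpha_on_vp, err_sigma_on_sigma, err_alpha_on_sigma)
```

Each formula recovers x0 on its own kind of latents, and the mismatch grows as `alpha_prod_t` shrinks, which would be consistent with the previews looking worst at the start of sampling.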

skirsten commented 1 year ago

Thank you so much for pointing me in this direction. I actually tried to find the formula in the diffusers codebase, but I looked in DPMSolverMultistepScheduler, which does not have it :).

> Oh, and also, for some schedulers, if you override kwargs['return_dict'] = True you can just use res.pred_original_sample and don't need get_pred_original_sample at all :P

I had the same idea after checking the scheduler code you linked :+1:. Here is the code I use now:

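preview_images = []  # decoded previews, one per scheduler step
orig_step = pipe.scheduler.step

def step(model_output, timestep, sample, *args, **kwargs):
    # Remember what the caller asked for, then force the dict-style output
    # so that pred_original_sample is available.
    return_dict = kwargs.get("return_dict", True)
    kwargs["return_dict"] = True

    output = orig_step(model_output, timestep, sample, *args, **kwargs)

    # Decode the predicted clean latents into an image for the preview.
    images = pipe.vae.decode(output.pred_original_sample / pipe.vae.config.scaling_factor)[0]

    preview_image = pipe.image_processor.postprocess(images)

    preview_images.append(preview_image[0])

    # Hand back the format the caller originally requested.
    if not return_dict:
        return (output[0],)

    return output

pipe.scheduler.step = step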

Unfortunately, it does not work for DPMSolverMultistepScheduler and PNDMScheduler, but it works for the other schedulers I listed above. Thank you again so much :smile:
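One way to keep a single wrapper for every scheduler is to skip the preview when the step output carries no pred_original_sample, which appears to be the case for the two schedulers named above. This is a minimal sketch, not the diffusers API: `wrap_step_for_previews`, `on_preview`, and the two fake scheduler classes are hypothetical stand-ins so the example runs without a pipeline.

```python
from types import SimpleNamespace


def wrap_step_for_previews(scheduler, on_preview):
    """Patch scheduler.step so on_preview(x0_estimate) runs after each step,
    but only when the output exposes pred_original_sample."""
    orig_step = scheduler.step

    def step(model_output, timestep, sample, *args, **kwargs):
        # Remember the caller's preference, then force the object-style output.
        return_dict = kwargs.get("return_dict", True)
        kwargs["return_dict"] = True
        output = orig_step(model_output, timestep, sample, *args, **kwargs)

        # Some outputs may not carry the field at all, so look it up defensively.
        pred = getattr(output, "pred_original_sample", None)
        if pred is not None:
            on_preview(pred)

        if not return_dict:
            return (output.prev_sample,)
        return output

    scheduler.step = step
    return scheduler


class FakeSchedulerWithPred:
    """Stand-in for a scheduler whose output includes pred_original_sample."""

    def step(self, model_output, timestep, sample, return_dict=True):
        return SimpleNamespace(prev_sample=sample - model_output,
                               pred_original_sample=sample - 2 * model_output)


class FakeSchedulerWithoutPred:
    """Stand-in for a scheduler whose output only has prev_sample."""

    def step(self, model_output, timestep, sample, return_dict=True):
        return SimpleNamespace(prev_sample=sample - model_output)


previews = []
sched = wrap_step_for_previews(FakeSchedulerWithPred(), previews.append)
result = sched.step(1.0, 0, 5.0, return_dict=False)

plain = wrap_step_for_previews(FakeSchedulerWithoutPred(), previews.append)
plain_result = plain.step(1.0, 0, 5.0)
```

In a real pipeline, `on_preview` would decode the latents as in the code above. For schedulers without the field, the estimate would have to be computed by hand, for example with a scheduler-specific formula like the one suggested earlier in the thread.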

gabgren commented 1 year ago

Hi,

Do you know how to implement this in a diffusers pipeline? Is it done in the callback() from the pipeline call, or elsewhere?

madebyollin commented 1 year ago

@gabgren By "this", do you mean "live previewing"? Live previewing with diffusers is demonstrated in the example notebook. When I checked, it seemed that live previewing can't be done with callbacks alone, because pred_original_sample is not accessible from callbacks.