LuChengTHU / dpm-solver

Official code for "DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps" (NeurIPS 2022 Oral)
MIT License

Unsatisfactory result. #23

Open GuHuangAI opened 1 year ago

GuHuangAI commented 1 year ago

I have trained a 1000-step diffusion model and get good results using the 1000-step reverse process. These are the original images (the two are concatenated): ori2 And these are the generated images: generate2

However, when using DPM-Solver, I get unsatisfactory or even worse results. Here is the 20-step image: dpm-20

100-step: dpm

What happened, and what should I do?

LuChengTHU commented 1 year ago

Hi @GuHuangAI ,

Could you please provide a detailed example code for using DPM-Solver? (e.g. which algorithm and which hyperparameters)

GuHuangAI commented 1 year ago

> Hi @GuHuangAI ,
>
> Could you please provide a detailed example code for using DPM-Solver? (e.g. which algorithm and which hyperparameters)

Thanks for your reply. Actually, I modified the original UNet and designed a mask-controlled diffusion model. Therefore, my model has an additional input, and I modified the `model_fn` function of `DPM_Solver`. My code looks like this:

```python
noise_schedule = NoiseScheduleVP(schedule='cosine')

model_fn = model_wrapper(
    model,
    noise_schedule,
    model_type="noise",  # or "x_start" or "v" or "score"
    model_kwargs=None,
)

# (construction of dpm_solver was omitted in the original snippet)
dpm_solver = DPM_Solver(model_fn, noise_schedule)

x_T = torch.randn((mask.shape[0], 3, *mask.shape[2:]), device=mask.device)
x_sample = dpm_solver.sample(
    x_T,
    mask,  # extra input consumed by the modified model_fn
    steps=20,
    order=2,
    skip_type="time_uniform",
    method="multistep",
)
```

In a word, I use the mask to generate images. Input: mask2 Output: generate2
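If the mask is simply forwarded to the network, a hypothetical alternative to modifying `model_fn` is to route it through `model_kwargs`, which `model_wrapper` passes through to the model; the sketch below assumes a signature like `model(x, t, mask=mask)`:

```python
# Hypothetical alternative: pass the extra input via model_kwargs instead of
# modifying model_fn (assumes the network accepts model(x, t, mask=mask)).
model_fn = model_wrapper(
    model,
    noise_schedule,
    model_type="noise",
    model_kwargs={"mask": mask},
)
dpm_solver = DPM_Solver(model_fn, noise_schedule)
x_sample = dpm_solver.sample(
    x_T,
    steps=20,
    order=2,
    skip_type="time_uniform",
    method="multistep",
)
```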

Since my work is for an industrial project, I'm sorry that I cannot share my code.

LuChengTHU commented 1 year ago

Hi @GuHuangAI ,

Thank you for the detailed settings!

In my opinion, the mask-conditioned diffusion model is the classical "inpainting" problem in diffusion models. The traditional way to solve this problem is to use the masked input at each step of the guided sampling procedure. Here is an example implementation of inpainting: https://github.com/LuChengTHU/dpm-solver/blob/main/example_v2/stable-diffusion/scripts/diffedit_inpaint.ipynb
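For concreteness, here is a minimal sketch of that masked-replacement trick (my wording, not the notebook's): `x_known` holds the known pixels, `mask` is 1 on the known region, and it assumes your copy of `dpm_solver_pytorch` exposes the `correcting_xt_fn` hook; if it does not, the same replacement can be applied manually between solver steps.

```python
import torch

def make_inpaint_correcting_fn(x_known, mask, noise_schedule):
    """Overwrite the known region of x_t with a freshly diffused copy of x_known."""
    def correcting_xt_fn(xt, t, step):
        # Diffuse the known pixels forward to the current noise level t.
        alpha_t = noise_schedule.marginal_alpha(t).view(-1, 1, 1, 1)
        sigma_t = noise_schedule.marginal_std(t).view(-1, 1, 1, 1)
        noisy_known = alpha_t * x_known + sigma_t * torch.randn_like(x_known)
        # Keep the generated content only where the region is unknown.
        return mask * noisy_known + (1. - mask) * xt
    return correcting_xt_fn

dpm_solver = DPM_Solver(
    model_fn, noise_schedule,
    correcting_xt_fn=make_inpaint_correcting_fn(x_known, mask, noise_schedule),
)
```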

However, a key point for implementing inpainting is that we do not use t_start=1.0 as the starting time. Instead, we often use something like t_start=0.5 or 0.6. The starting value is then a stochastic encoding of your masked image at time t_start (please check the above example for details).
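A sketch of that starting value, assuming `x_masked` is the masked input image (the exact recipe is in the linked notebook):

```python
# Stochastically encode the masked image at t_start instead of drawing pure noise at t=1.
t_start = 0.6
t = torch.full((x_masked.shape[0],), t_start, device=x_masked.device)
alpha_t = noise_schedule.marginal_alpha(t).view(-1, 1, 1, 1)
sigma_t = noise_schedule.marginal_std(t).view(-1, 1, 1, 1)
x_start = alpha_t * x_masked + sigma_t * torch.randn_like(x_masked)

x_sample = dpm_solver.sample(
    x_start,
    steps=20,
    order=2,
    t_start=t_start,  # start from t=0.6 rather than t=1.0
    skip_type="time_uniform",
    method="multistep",
)
```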

In addition, I have another question: does your model use a continuous cosine schedule? If not, please use the "discrete" schedule and provide the betas to NoiseScheduleVP.
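For reference, a one-line sketch, assuming `betas` is the length-1000 beta tensor used in training:

```python
# Match the solver's noise schedule to the one used at training time.
noise_schedule = NoiseScheduleVP(schedule='discrete', betas=betas)
# Equivalently, pass the cumulative products of (1 - beta_i):
# noise_schedule = NoiseScheduleVP(schedule='discrete', alphas_cumprod=alphas_cumprod)
```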

GuHuangAI commented 1 year ago

@LuChengTHU , thanks for your suggestions. I have tried many combinations of hyperparameters; unfortunately, they all failed. Maybe my model is so sensitive that it only produces good samples with many steps. (-. -)

LuChengTHU commented 1 year ago

Hi @GuHuangAI ,

Have you tried the original DDIM? Can it produce clean samples? If it cannot produce clean samples either, then the problem may be your sensitive model rather than the solver. Please follow these instructions to check whether DPM-Solver is suitable for your model: https://github.com/LuChengTHU/dpm-solver#suggestions-for-choosing-the-hyperparameters
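Since first-order DPM-Solver is equivalent to DDIM, a run like the following (the step count is only a suggestion) doubles as that DDIM sanity check:

```python
# order=1 DPM-Solver reduces to DDIM; if even this fails with many steps,
# the issue is likely the model or the noise schedule, not the solver.
x_sample = dpm_solver.sample(
    x_T,
    steps=250,
    order=1,
    skip_type="time_uniform",
    method="singlestep",
)
```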

CreamyLong commented 1 year ago

I have the same problem: I did not get clean pictures using DDIM in guided-diffusion either. What could be the reason that DDIM or DPM-Solver does not work?