crowsonkb / k-diffusion

Karras et al. (2022) diffusion models for PyTorch
MIT License
2.21k stars 371 forks source link

LMS sampler question #76

Closed catboxanon closed 4 months ago

catboxanon commented 12 months ago

Hi Katherine,

Wondering if something is wrong with the LMS sampler. We've had several issues open for quite a long time in the stable-diffusion-webui repo but were never really investigated until recently. DPM2 was resolved a long while back but LMS was not.

DPM2: https://github.com/crowsonkb/k-diffusion/issues/43, https://github.com/AUTOMATIC1111/stable-diffusion-webui/pull/5797 LMS: https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/1973, https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/7244

Below is an example of a result produced currently by LMS using the default scheduler in the webui. PNG info is embedded in the image to reference if needed.

lms1

I had attempted to make a PR in https://github.com/AUTOMATIC1111/stable-diffusion-webui/pull/12349 to resolve this. The way I did this was by discarding the penultimate sigma and to also use the penultimate latent (what the last callback would return for the last step). A naive approach I'm sure, but this is that result:

lms2

@AUTOMATIC1111 reviewed this PR and had suggested the issue stems from the last step combining the previous 4 steps, and suggested a fix.

https://github.com/AUTOMATIC1111/stable-diffusion-webui/pull/12349#issuecomment-1666721704

After going into that function with a debugger, it looks like it combines denoised image from 4 last sampling steps to produce the next denoised image. cur_order variable determines how many previous steps to combine - it starts at 1, and grows to 4, and stays at 4. The last step where it combines four last denoised images into the final result produces those artifacts. I found that making cur_order go back to 1 at the end of sampling instead of staying at 4 improves the result dramatically. The relevant change is from cur_order = min(i + 1, order) to cur_order = min(i + 1, order, len(sigmas) - i - 1).

When instead only applying that change, I get this result:

lms3

Both of these modifications are clearly better than how the current code behaves but neither of us are sure what is correct here. Hoping you could provide some explanation on whether this is expected or not and how it should be properly resolved.

Olk14 commented 9 months ago

I've had this issue since June and have tried everuthing to resolve it.

catboxanon commented 4 months ago

Closing as stale.