Open czkoko opened 9 months ago
Just found this repo... I think this might help: https://github.com/huggingface/diffusers/pull/6477/files
@GuiyeC @nosferatu500 This seems to be the key to the problem. The picture-quality problems I encountered with DPM++ 2M Karras are the same. DPM++ SDE Karras seems to behave somewhat differently, and the picture becomes hazy.
I'm also using the diffusers repo as a reference for implementing the pipeline and schedulers in my own app, but I'm still working from v0.9 (I started with v0.3), and v0.26 has too many changes for me. So I don't know whether any changes beyond this commit are needed, and I can't make a PR (yet), but I noticed this commit a while ago.
The algorithm used by DrawThings produces very good picture quality; I don't know if it can serve as a reference: https://github.com/liuliu/swift-diffusion/blob/1f0b2acead80ae98665c0235083893c7fa6da7e5/examples/sdxl_txt2img/main.swift#L719
I made the following change, but it didn't improve the picture quality:
`karrasSigmas.append(karrasSigmas.last!)` => `karrasSigmas.append(0)`
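For reference, the change in the diffusers PR linked above is about which sigma the schedule ends on. Here is a minimal Python sketch of a Karras-style sigma schedule (the function name, defaults, and `final_sigma_zero` flag are my own for illustration, not taken from either repo) showing the two choices for the trailing sigma:

```python
import numpy as np

def karras_sigmas(n_steps, sigma_min=0.0292, sigma_max=14.6146, rho=7.0,
                  final_sigma_zero=True):
    # Karras et al. (2022) noise schedule: interpolate in sigma^(1/rho) space.
    ramp = np.linspace(0, 1, n_steps)
    min_inv_rho = sigma_min ** (1 / rho)
    max_inv_rho = sigma_max ** (1 / rho)
    sigmas = (max_inv_rho + ramp * (min_inv_rho - max_inv_rho)) ** rho
    # The trailing sigma decides where the last denoising step lands:
    # appending 0 steps all the way to a clean image, while repeating the
    # last sigma leaves residual noise (the hazy look discussed above).
    last = 0.0 if final_sigma_zero else sigmas[-1]
    return np.append(sigmas, last)
```

Whether appending 0 alone is enough also depends on how the scheduler's step function handles a zero terminal sigma, which may be why the change above did not help on its own.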
Guernikakit, DPM++ 2M Karras, SDXL, 15 steps:
DrawThings, DPM++ 2M Karras, SDXL, 15 steps:
I'm not sure, but maybe the problem is with the `convertToKarras` func? I can't check it myself, but if I remember correctly, the implementations of that func in Guernika and the ml-stable-diffusion repo are different. I don't know which one is correct, though, so I would look at the implementation in the diffusers repo.
@GuiyeC I found some reasons:
The Unet's metadata.json -> "userDefinedMetadata" is missing "timestep_spacing".
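To illustrate the kind of fallback a loader could use here, a short Python sketch (the key names follow the metadata.json structure described above, but the helper and its default value are hypothetical):

```python
import json

def read_timestep_spacing(metadata_path, default="linspace"):
    # Hypothetical helper: look up "timestep_spacing" inside the Unet's
    # metadata.json -> "userDefinedMetadata", and fall back to a default
    # when the key is missing (the situation described above).
    with open(metadata_path) as f:
        meta = json.load(f)
    user = meta.get("userDefinedMetadata") or {}
    return user.get("timestep_spacing", default)
```

An explicit default at least makes the behavior deterministic when the converted model omits the key.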
The picture quality at the second-to-last step is obviously better than at the last step.
The following is a comparison of DPM++ 2M Karras, 15 steps, SDXL.
Second-to-last step above, last step below.
The following is a comparison of DPM++ 2M, 18 steps, SDXL.
Second-to-last step above, last step below.
But with DPM++ SDE Karras, the last two steps both show the same hazy picture quality.
> I'm not sure, but maybe the problem is with the `convertToKarras` func? I can't check it myself, but if I remember correctly, the implementations of the func in Guernika and the ml-stable-diffusion repo are different. But I don't know which one is correct, so... I would look at the implementation in the diffusers repo.
It is the same algorithm as ml-stable-diffusion's, just with some code refactoring and optimization.
`let image = try decodeToImage(latent)` => `let image = try decodeToImage(scheduler.modelOutputs.last ?? latent)`
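In Python terms, the pattern of that fix looks roughly like this (a minimal sketch; the class and names are illustrative, not GuernikaKit's actual API):

```python
class SketchScheduler:
    """Keeps the denoised model outputs so the pipeline can decode the
    last prediction instead of the final (still slightly noisy) latent."""

    def __init__(self):
        self.model_outputs = []

    def step(self, model_output, latent):
        # A real DPM++ step would combine model_output with the latent;
        # here we only record the output to mirror the fix above.
        self.model_outputs.append(model_output)
        return latent

def latent_to_decode(scheduler, final_latent):
    # Mirrors `scheduler.modelOutputs.last ?? latent`: prefer the last
    # denoised prediction, fall back to the raw latent.
    return scheduler.model_outputs[-1] if scheduler.model_outputs else final_latent
```

Decoding the last denoised prediction sidesteps the residual noise left in the final latent when the schedule does not end at sigma zero.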
@GuiyeC Is this project's DPM++ SDE Karras algorithm different from other implementations of DPM++ SDE Karras, such as ComfyUI's?
The current project's DPM++ SDE Karras, SDXL, 20 steps:
Another implementation's DPM++ SDE Karras, SDXL, 12 steps:
DPM++ 2M Karras also has problems. For the turbo model, DPM++ SDE Karras seems to work normally, but by comparison it requires 2-3 more steps.