mcmonkeyprojects / sd-dynamic-thresholding

Dynamic Thresholding (CFG Scale Fix) for Stable Diffusion (SwarmUI, ComfyUI, and Auto WebUI)
MIT License
1.1k stars 103 forks

What is mimic? #86

Closed (szriru closed this issue 8 months ago)

szriru commented 8 months ago

I understand from the wiki why we need the CFG scheduler. A low CFG scale in early steps and a higher CFG scale in late steps is usually good for image quality. So this extension lets us change the CFG scale across steps, right? But what about the mimic CFG? I need an explanation of it. Maybe it should go here? https://github.com/mcmonkeyprojects/sd-dynamic-thresholding/wiki/Usage-Tips#:~:text=(TODO%3A%20Explanation%20of%20what%20these%20are)

Thank you for reading. I look forward to your answer.

mcmonkey4eva commented 8 months ago

Want concept/math details? Here's the original 2022 Imagen paper, which introduced the idea: https://arxiv.org/pdf/2205.11487.pdf See also this more recent paper (CTRL-F for "rescale"): https://arxiv.org/pdf/2305.08891.pdf (though their preferred method is a bit suboptimal)

Want the short n simple? It tries to calculate the range of values the image-latent would have if it were generated with a lower CFG scale, and then scales the actual image-latent it got down to that scale. But, like, with a bit more math.

Want the technical detail? https://github.com/mcmonkeyprojects/sd-dynamic-thresholding/blob/master/dynthres_core.py#L66-L122 The code has chunks of comments explaining the behavior along the hot path.
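For readers who want the "short n simple" version in code form, here is a rough sketch of the idea. It is not the extension's actual code. The function names (`cfg_combine`, `centered_range`, `mimic_rescale`) are made up for illustration, the latent is treated as a flat list of floats, and the "range" is taken as the largest deviation from the mean, which is an assumption; the real implementation in dynthres_core.py does more math than this.

```python
# Illustrative sketch only: names and the choice of "range" measure are
# assumptions, not the extension's real API or exact algorithm.

def cfg_combine(uncond, cond, scale):
    """Standard classifier-free guidance: uncond + scale * (cond - uncond)."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


def centered_range(values):
    """Return (mean, largest absolute deviation from that mean)."""
    mean = sum(values) / len(values)
    return mean, max(abs(v - mean) for v in values)


def mimic_rescale(uncond, cond, cfg_scale, mimic_scale):
    """Compute the high-CFG result, then shrink its spread to the spread
    the result would have had at the lower 'mimic' CFG scale."""
    actual = cfg_combine(uncond, cond, cfg_scale)    # what we really got
    mimic = cfg_combine(uncond, cond, mimic_scale)   # what a lower CFG would give

    actual_mean, actual_range = centered_range(actual)
    _, mimic_range = centered_range(mimic)

    if actual_range == 0:
        return actual  # nothing to rescale

    ratio = mimic_range / actual_range
    # Keep the high-CFG result's mean and shape, but use the mimic's range.
    return [actual_mean + (v - actual_mean) * ratio for v in actual]


if __name__ == "__main__":
    uncond = [0.2, -0.1, 0.0, 0.1]
    cond = [1.0, -1.0, 0.5, -0.5]
    print(mimic_rescale(uncond, cond, cfg_scale=14.0, mimic_scale=7.0))
```

The output keeps the direction pushed by the high CFG scale, but its value range matches what the mimic scale would have produced, which is the part that avoids the burnt look at high CFG.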

szriru commented 8 months ago

Thank you very much.