AUTOMATIC1111 / stable-diffusion-webui

Stable Diffusion web UI
GNU Affero General Public License v3.0

Per-pixel denoising strength #1706

Open Pfaeff opened 1 year ago

Pfaeff commented 1 year ago

Is your feature request related to a problem? Please describe. Creating variations of an image in which some parts shouldn't change as much as others, but also shouldn't stay completely fixed as they do with inpainting. This would be especially useful when using img2img for upscaling and detail enhancement.

Describe the solution you'd like An option to provide a mask by which the denoising strength will be multiplied.

Describe alternatives you've considered Inpainting, but inpainting leaves the masked region completely untouched.

cmp-nct commented 1 year ago

Isn't that already the case? I expected the saturation of the mask to be used as a modifier on the strength.

Ehplodor commented 1 year ago

> Isn't that already the case ? I expected that the saturation of the mask is used as a modifier on strength.

No, it is not. Until yesterday at least, a mask drawn from within the UI was pitch black, so there was obviously only one level of inpainting "strength". And if the user provides a mask through the alpha channel of the image, then ANY nonzero alpha value is treated as part of the mask.
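To illustrate the difference being described, here is a minimal sketch in plain Python. It is not the web UI's actual code: the function names `binary_mask` and `graded_mask` and the 0-255 alpha range are assumptions made for the example. The first function mimics the behaviour described above (any nonzero alpha counts as fully masked), while the second keeps the gradation that a per-pixel strength would need.

```python
# Hypothetical helpers for illustration only; not taken from stable-diffusion-webui.

def binary_mask(alpha, threshold=0):
    """Treat every alpha value above `threshold` as fully masked (1.0)."""
    return [[1.0 if a > threshold else 0.0 for a in row] for row in alpha]


def graded_mask(alpha, max_value=255):
    """Keep the gradation: scale alpha from [0, max_value] to [0.0, 1.0]."""
    return [[a / max_value for a in row] for row in alpha]


# One row of 8-bit alpha values (assumed range 0-255).
alpha = [[0, 1, 128, 255]]

hard = binary_mask(alpha)   # alpha 1 and alpha 255 end up identical
soft = graded_mask(alpha)   # alpha 1 stays close to 0, alpha 255 becomes 1.0
```

With the binary version, a barely visible alpha of 1 is indistinguishable from a fully opaque 255, which is why only one level of "strength" is available.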

cmp-nct commented 1 year ago

In that case, that's quite an important addition imho. The denoising value should be multiplied by the mask blackness, if that's possible.
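As a rough sketch of the requested arithmetic (not an existing web UI feature): the global denoising strength is scaled by a mask value between 0.0 and 1.0 at each pixel. The function name `per_pixel_strength` and the convention that 1.0 means "fully masked" are assumptions for this example.

```python
# Hypothetical illustration of "denoising strength multiplied by the mask".

def per_pixel_strength(denoising_strength, mask):
    """Return a strength map: global strength times the mask value per pixel.

    `mask` is a 2D list of floats; values are clamped to [0.0, 1.0].
    """
    if not 0.0 <= denoising_strength <= 1.0:
        raise ValueError("denoising_strength must be between 0.0 and 1.0")
    return [
        [denoising_strength * min(max(m, 0.0), 1.0) for m in row]
        for row in mask
    ]


# A mask that is unmasked, half masked, and fully masked.
strength_map = per_pixel_strength(0.6, [[0.0, 0.5, 1.0]])
```

Here the unmasked pixel gets strength 0.0, the half-masked pixel half of the global value, and the fully masked pixel the full global value of 0.6.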

Ehplodor commented 1 year ago

FYI, this was already asked in https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/560

Pfaeff commented 1 year ago

That's not the same thing, though.

Ehplodor commented 1 year ago

Agree. That would have been a good start.

juschu commented 7 months ago

Does anyone know if there's a plugin that can already do this? Besides that, I also just want to bump this issue to the top. I think it's an essential feature, but after over a year it's still open. Or has this already been implemented, and someone just forgot to close the issue?

catboxanon commented 7 months ago

Well, https://github.com/AUTOMATIC1111/stable-diffusion-webui/pull/14208 sounds like what OP is asking for.

CodeHatchling commented 7 months ago

As the author of https://github.com/AUTOMATIC1111/stable-diffusion-webui/pull/14208, I've looked into whether per-pixel denoising strength is something that could be natively supported by just passing in different data.

The current models and sampling methods assume a constant noise level throughout the image, and there isn't a way to specify varying noise levels. Supporting varying noise levels is theoretically possible, and could be achieved by training an existing checkpoint while passing in an additional channel (similar to inpainting models, which take image and mask conditioning as additional input).

I've tried simply passing in an image with a varying noise level to the denoiser, but this causes the denoiser to oversmooth the areas without noise. It seems to assume that a certain percentage of the content in any area of an image must be noise, even if it isn't.

The branch I wrote tries to emulate a similar effect through interpolating the original image latents with the denoised latents at each step. The blending occurs according to a mask (which you could interpret as a per-pixel denoising strength multiplier). While the effect might not be exactly the same as true per-pixel denoising strength, it is probably as close as you can get without requiring a new network architecture.
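The idea of blending at each step can be sketched roughly as follows. This is a toy illustration in plain Python, not the code from the pull request: the names `blend` and `sample_with_soft_mask`, the flat list standing in for a latent tensor, and the stand-in `toy_denoise_step` are all assumptions, and the sketch ignores the noise schedule that a real sampler would have to account for when mixing in the original latents.

```python
# Toy sketch of per-step latent blending; all names here are hypothetical.

def blend(original, denoised, mask):
    """Linearly interpolate per element: mask 0 keeps original, mask 1 keeps denoised."""
    return [o + m * (d - o) for o, d, m in zip(original, denoised, mask)]


def sample_with_soft_mask(original, mask, denoise_step, steps):
    """Run `steps` denoising steps, pulling latents back toward the original after each."""
    latents = list(original)
    for step in range(steps):
        latents = denoise_step(latents, step)     # the sampler proposes new latents
        latents = blend(original, latents, mask)  # the mask limits how far they may drift
    return latents


def toy_denoise_step(latents, step):
    """Stand-in for a real sampler step: move each value halfway toward 1.0."""
    return [x + 0.5 * (1.0 - x) for x in latents]


original = [0.0, 0.0, 0.0]
mask = [0.0, 0.5, 1.0]  # acts like a per-element denoising strength multiplier
result = sample_with_soft_mask(original, mask, toy_denoise_step, steps=2)
```

In this toy run, the element with mask 0.0 never leaves its original value, the element with mask 1.0 follows the sampler freely, and the element with mask 0.5 ends up in between.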