Closed: CodeHatchling closed this issue 9 months ago
I've made some progress on implementing this myself.
The following images were generated with a mask with varying blur levels applied:
[Images: Original | Masked, 64 blur | Masked, 48 blur | Masked, 32 blur | Masked, 4 blur]
At each denoising step, the masking process interpolates the latents toward the original image by an amount determined by the mask opacity and the denoising step size.
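A minimal NumPy sketch of that per-step blend (illustrative only; `blend_latents` and its signature are assumptions, not the actual webui code):

```python
import numpy as np

def blend_latents(denoised, original, mask_opacity, step_size):
    """Pull the denoised latents back toward the original latents,
    weighted per pixel by mask opacity and scaled by the size of the
    current denoising step (hypothetical sketch)."""
    # opacity 1 -> keep the denoised result (for a full-size step);
    # opacity 0 -> revert entirely to the original latents.
    w = mask_opacity * step_size
    return original + w * (denoised - original)

# Tiny demo on a 2x2 single-channel "latent":
orig = np.zeros((2, 2))
den = np.ones((2, 2))
mask = np.array([[0.0, 0.5],
                 [1.0, 1.0]])
out = blend_latents(den, orig, mask, step_size=1.0)
# out: [[0.0, 0.5], [1.0, 1.0]]
```

Pixels with intermediate opacity land partway between the two latents, which is what produces the soft transition in the images above.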
While this works okay, the denoiser appears to get a bad estimate of the noise level at partially masked pixels: those pixels are less noisy than fully masked ones, yet they are treated as having the same noise level, which can leave the transition areas looking oversmoothed. I suspect this is because, normally, the unmasked pixels are not given any noise.
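The mismatch is easy to demonstrate numerically: blending a noised latent with a clean original scales the noise standard deviation by the blend weight, so a pixel at 50% opacity effectively sits at half the sigma the denoiser assumes. A quick sketch (illustrative, not webui code):

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 1.0
clean = np.zeros(100_000)                      # clean "original" latent
noised = clean + rng.normal(0.0, sigma, clean.shape)

# Blend the noised and clean latents as at a partially masked pixel:
opacity = 0.5
blended = opacity * noised + (1 - opacity) * clean

# The effective noise level is opacity * sigma, but the denoiser
# still assumes the full sigma everywhere -- hence the oversmoothing.
effective_sigma = blended.std()                # ~0.5, not 1.0
```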
I think I might be able to improve this now that I know what kind of image the denoiser is expecting. My next idea is:
I basically solved this problem completely :)
When can I push? ;)
Another example :)
Code cleaned up and forked over to here: https://github.com/CodeHatchling/stable-diffusion-webui-soft-inpainting
Great work on this!
> When can I push? ;)
Feel free to make a PR whenever you think this is ready. You may need to account for merge conflicts since your fork was based on the master branch and not the dev branch though.
Is there an existing issue for this?
What would your feature do ?
Problem to solve
It appears that the denoiser only considers a binary mask (with a hard boundary) when deciding which pixels to denoise, even at extreme blur values. Specifically, the region under a pixel is denoised only if the mask/sketch opacity there is greater than 50%. The resulting image and the original image are then simply alpha-blended together using the mask opacity values.
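The behavior described above can be sketched as follows (an assumed simplification for illustration, not the actual webui implementation):

```python
import numpy as np

def current_behavior(denoised, original, mask_opacity):
    """Sketch of the described behavior: denoising is gated by a hard
    >= 50% opacity threshold, and the final image is an alpha blend of
    the two using the soft opacity (hypothetical, for illustration)."""
    hard_mask = mask_opacity >= 0.5                      # binary boundary
    denoised_region = np.where(hard_mask, denoised, original)
    # Composite still uses the soft opacity, producing the fade:
    return mask_opacity * denoised_region + (1 - mask_opacity) * original

mask = np.array([0.0, 0.3, 0.5, 0.9])
out = current_behavior(np.ones(4), np.zeros(4), mask)
# Below the threshold the denoised content never appears at all,
# regardless of opacity; above it, only the alpha blend softens the edge.
```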
Why this is a problem
What possibilities solving it brings
Proposed solution
Interpret the mask opacity as a per-pixel multiplier for the denoising strength. AFAIK there are a few ways one could achieve this effect:
I believe either of these would allow inpainting objects with partial opacity or very gradual transitions, where content in a transition region is preserved.
Alternate solution: dithering
A simpler option could be to use dithering to decide whether a given pixel/block is masked. In other words, using some kind of dithering pattern (Bayer, blue noise, Floyd–Steinberg), the mask opacity represents the probability that a given element of the image is affected by the denoiser.
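An ordered-dithering version of this idea could look like the following sketch (the function name and 2x2 tile size are illustrative choices, not from any existing code):

```python
import numpy as np

# 2x2 Bayer matrix with thresholds in [0, 1), tiled across the mask.
BAYER_2x2 = np.array([[0.00, 0.50],
                      [0.75, 0.25]])

def dither_mask(mask_opacity):
    """Binarize a soft mask with ordered (Bayer) dithering: a pixel is
    masked iff its opacity exceeds the tiled threshold, so the *density*
    of masked pixels tracks the local opacity (illustrative sketch)."""
    h, w = mask_opacity.shape
    tiles = np.tile(BAYER_2x2, (h // 2 + 1, w // 2 + 1))[:h, :w]
    return mask_opacity > tiles

soft = np.full((4, 4), 0.5)
hard = dither_mask(soft)      # exactly half the pixels come out masked
```

A blue-noise threshold texture would work the same way and avoid the visible regular pattern of the Bayer tile.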
Alternate solution: adjust mask threshold
An even simpler solution could be to change the mask opacity threshold at which denoising occurs from >=50% to >0%. In other words, any pixel with opacity greater than 0 is included in the denoising. Then, the original content could be blended over top to completely hide the seam at the point where the mask reaches 0 opacity.
However, the main drawback is that ghosting artifacts will appear where both the original and modified image are visible. (Though this is an issue with the current implementation anyway.)
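This alternative amounts to a one-line change to the gating condition in the earlier sketch (again hypothetical, for illustration):

```python
import numpy as np

def widened_threshold(denoised, original, mask_opacity):
    """Sketch of the 'adjust mask threshold' alternative: denoise
    wherever opacity > 0 (instead of >= 0.5), then blend the original
    back over top so the seam at zero opacity disappears."""
    include = mask_opacity > 0.0                 # widened gate
    denoised_region = np.where(include, denoised, original)
    # Blend the original over top by (1 - opacity); where both images
    # remain partially visible, ghosting can appear.
    return mask_opacity * denoised_region + (1 - mask_opacity) * original

mask = np.array([0.0, 0.2, 0.8])
out = widened_threshold(np.ones(3), np.zeros(3), mask)
# Unlike the >= 50% gate, the denoised content now contributes at 0.2.
```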
Proposed workflow