alpha compositing with overlay strategy produces artifacts if faces are close together

nothings commented 1 year ago

Here's an example of broken output:

Here's an illustration of what's going wrong with the algorithm.

This happens because of the alpha compositing strategy. The tool cuts holes in the original image where all the faces are, then puts the generated tiles underneath so you can "see through" the holes to the generated images. If the rectangular tile for one face is too large, it can extend far enough to be visible through the hole for another face.

In this particular case, it happens because the image is small, so the padding is large relative to the faces. It can also happen when using square tiles with faces that are taller than they are wide, since this adds extra padding horizontally.
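To make the geometry concrete, here is a minimal sketch of how a padded, squared tile can reach into a neighbouring face. The helper names (`padded_square_tile`, `rects_overlap`) and the padding rule are hypothetical illustrations, not the extension's actual code.

```python
def padded_square_tile(face_box, padding):
    """Expand a face box (x, y, w, h) into a square tile with padding on every side.

    Assumed rule for illustration: side = max(w, h) + 2 * padding,
    centered on the face box.
    """
    x, y, w, h = face_box
    side = max(w, h) + 2 * padding
    cx, cy = x + w / 2, y + h / 2
    return (cx - side / 2, cy - side / 2, side, side)


def rects_overlap(a, b):
    """Return True if two (x, y, w, h) rectangles share any area."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


# Two tall faces side by side, 10 px apart.
face_a = (20, 20, 30, 50)
face_b = (60, 20, 30, 50)

# Squaring a tall face adds horizontal slack on top of the padding.
tile_a = padded_square_tile(face_a, padding=8)

faces_touch = rects_overlap(face_a, face_b)      # the face boxes are disjoint
tile_hits_face = rects_overlap(tile_a, face_b)   # A's tile reaches B's hole
print(faces_touch, tile_hits_face)
```

In this toy case the faces themselves never touch, but the tile for face A still covers part of face B, which is the condition under which the artifact appears.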

Although it is possible to tweak the settings on individual images to avoid this, it's a batch processor, so it would be better if it just got it right automatically. It's straightforward to do so: alpha composite each generated face with its own mask, rather than cutting all the masks out of a single image.
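The difference between the two strategies can be shown with a toy model: a one-dimensional strip of pixels where "o" is the original image and each face contributes a letter. This is only a sketch under simplified assumptions (function names and the data layout are made up for illustration), not the extension's code.

```python
def overlay_strategy(width, faces):
    """Paste every tile onto one layer, then view it through the merged holes."""
    under = [None] * width
    for name, (tile_start, tile_end), _mask in faces:
        for i in range(tile_start, tile_end):
            under[i] = name  # later tiles overwrite earlier ones
    holes = set()
    for _name, _tile, mask in faces:
        holes |= set(mask)  # all masks cut out of a single image
    return [
        under[i] if i in holes and under[i] is not None else "o"
        for i in range(width)
    ]


def per_face_strategy(width, faces):
    """Composite each tile through its own mask only."""
    out = ["o"] * width
    for name, (tile_start, tile_end), mask in faces:
        for i in mask:
            if tile_start <= i < tile_end:
                out[i] = name
    return out


# Each face: (label, tile range, mask indices). B's tile overlaps A's mask at 5.
faces = [
    ("A", (0, 8), range(2, 6)),
    ("B", (5, 13), range(7, 11)),
]

overlay_result = "".join(overlay_strategy(13, faces))
per_face_result = "".join(per_face_strategy(13, faces))
print(overlay_result)
print(per_face_result)
```

With the shared holes, pixel 5 shows B's tile through A's hole; with per-face masks, every pixel inside A's mask comes from A's tile.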

I implemented a basic version of this myself, but it didn't interact properly with batching so it's probably better if you write it yourself.

Here's the basic compositing step that I used to replace 'apply_overlay', where paste_loc now contains the per-face mask as well as the old values:

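from PIL import Image, ImageChops
from modules import images  # assumed: the webui's modules.images helper

def apply_masked_face(face, paste_loc, final_image):
    # paste_loc carries the per-face mask in addition to the old x, y, w, h
    x, y, w, h, mask = paste_loc
    # transparent full-size canvas so the tile lines up with final_image
    base_image = Image.new('RGBA', (final_image.width, final_image.height))
    face = images.resize_image(1, face, w, h)
    base_image.paste(face, (x, y))
    face = base_image
    # limit the tile's alpha to this face's own mask ("L" mode, full image size)
    new_mask = ImageChops.multiply(face.getchannel("A"), mask)
    face.putalpha(new_mask)
    # alpha_composite requires both images to be RGBA and the same size
    final_image = Image.alpha_composite(final_image, face)
    return final_image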

The multiply probably isn't necessary; you could just do face.putalpha(mask) instead. I also didn't know what case paste_loc=None is meant to handle, so I didn't handle it.

To do this properly, you'll have to dilate each face mask independently, rather than dilating after merging. You'll still want to build the combined-mask "overlay" image for SD inpainting to work on so that if another face is visible in the padding region, the inpainting doesn't try to converge with it.
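As a rough sketch of that mask bookkeeping, the following toy code dilates each face mask on its own and then takes the union as the combined mask. Masks are modelled as sets of (x, y) pixels, and `dilate` and `build_masks` are hypothetical helpers written for this illustration; a real implementation would presumably operate on image masks instead.

```python
def dilate(mask, radius, width, height):
    """Grow a set of (x, y) pixels by `radius` steps of 4-neighbour dilation."""
    grown = set(mask)
    for _ in range(radius):
        frontier = set()
        for x, y in grown:
            for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                if 0 <= nx < width and 0 <= ny < height:
                    frontier.add((nx, ny))
        grown |= frontier
    return grown


def build_masks(face_masks, radius, width, height):
    """Return (per-face dilated masks, combined mask for inpainting)."""
    per_face = [dilate(m, radius, width, height) for m in face_masks]
    combined = set().union(*per_face)
    return per_face, combined


# Two small faces on an 8x1 strip, close enough to touch after dilation.
face_a = {(1, 0), (2, 0)}
face_b = {(5, 0), (6, 0)}
per_face, combined = build_masks([face_a, face_b], radius=1, width=8, height=1)

print(sorted(per_face[0]))
print(sorted(per_face[1]))
print(len(combined))
```

Here the combined mask becomes one continuous blob, so dilating after merging would lose track of which pixels belong to which face, while the per-face masks stay separate and can each be used for compositing.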

Another advantage is that you can increase the padding size, which may make the inpainting more coherent. For example, it may be able to infer gender and age from the hair and body if the padding extends far enough.

kex0 commented 1 year ago

Hey, nice catch. Your proposed solution seems to work flawlessly. Let me know if it's working as you would expect. Thank you.

nothings commented 1 year ago

Thanks, I am in the middle of tuning a new face detector that handles smaller faces, so I don't want to mess that up right now, but I'll pull the fix down and check it soon. Thanks!

nothings commented 1 year ago

Looks great, thanks!

kex0 / batch-face-swap

alpha compositing with overlay strategy produces artifacts if faces are close together #14