Closed by nothings 1 year ago
Hey, nice catch. Your proposed solution seems to work flawlessly. Let me know if it's working as you would expect. Thank you.
Thanks, I am in the middle of tuning a new face detector that handles smaller faces, so I don't want to mess that up right now, but I'll pull the fix down and check it soon. Thanks!
Looks great, thanks!
Here's an example of broken output:
Here's an illustration of what's going wrong with the algorithm.
This happens because of the alpha compositing strategy. The algorithm cuts holes in the original image where all the faces are, then puts the generated tiles underneath so you can "see through" the holes to the generated faces. If the rectangular tile for one face is too large, it can extend far enough to be visible through the hole for another face.
In this particular case, it happens because the image is small, so the padding is large relative to the faces. It can also happen when using square tiles with faces that are taller than they are wide; this adds extra padding horizontally.
Although it is possible to tweak the settings on individual images to avoid this, it's a batch processor, so it would be better if it just got it right automatically. Getting it right automatically is straightforward: alpha composite each generated face with its own mask, rather than cutting all the masks out of a single image.
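For illustration, here is a minimal sketch of per-face compositing, assuming Pillow images. The function name `composite_faces` and the `(tile, mask, (x, y))` layout are hypothetical and are not taken from the extension's code; each mask is assumed to be a mode "L" image the same size as its tile.

```python
from PIL import Image

def composite_faces(original, faces):
    """Paste each generated tile through its own mask.

    faces: iterable of (tile, mask, (x, y)) tuples (hypothetical layout).
    The mask limits what each tile may overwrite, so an oversized tile
    cannot show through the region that belongs to another face.
    """
    result = original.copy()  # leave the input image untouched
    for tile, mask, (x, y) in faces:
        # Image.paste blends `tile` into `result` using `mask` as alpha.
        result.paste(tile.convert(result.mode), (x, y), mask)
    return result
```

Because every tile carries its own mask, the order in which the faces are pasted only matters where two masks actually overlap.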
I implemented a basic version of this myself, but it didn't interact properly with batching so it's probably better if you write it yourself.
Here's the basic compositing step that I used to replace 'apply_overlay', where paste_loc now contains the per-face mask as well as the old values:
The multiply probably isn't necessary; you could just do
face.putalpha(mask)
instead. I also didn't know what case paste_loc=None is meant to handle, so I didn't handle it.
To do this properly, you'll have to dilate each face mask independently, rather than dilating after merging. You'll still want to build the combined-mask "overlay" image for SD inpainting to work on, so that if another face is visible in the padding region, the inpainting doesn't try to converge with it.
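A rough sketch of that mask handling, assuming Pillow and mode "L" masks. The names `dilate_mask` and `build_masks` and the `(mask, (x, y))` layout are hypothetical; the dilation here uses `ImageFilter.MaxFilter`, which may differ from whatever dilation the extension actually applies.

```python
from PIL import Image, ImageChops, ImageFilter

def dilate_mask(mask, radius):
    # A max filter over a (2*radius+1) square window grows the white area.
    return mask.filter(ImageFilter.MaxFilter(2 * radius + 1))

def build_masks(size, face_masks, radius):
    """Return (per_face, combined) full-size masks.

    face_masks: list of (mask, (x, y)) in mode "L" (hypothetical layout).
    Each face is dilated on its own canvas; the combined mask is the
    pixelwise maximum of the dilated masks, for use as the inpainting overlay.
    """
    per_face = []
    combined = Image.new("L", size, 0)
    for mask, (x, y) in face_masks:
        full = Image.new("L", size, 0)
        full.paste(mask, (x, y))
        full = dilate_mask(full, radius)
        per_face.append(full)
        combined = ImageChops.lighter(combined, full)
    return per_face, combined
```

The per-face masks would drive the final compositing, while the combined mask is only there to tell the inpainting which regions belong to some face.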
Another advantage of this is that you can increase the padding size, which may make the inpainting more coherent. For example, it may be able to infer gender and age from the hair and body if the padding extends far enough.