sail-sg / EditAnything

Edit anything in images powered by segment-anything, ControlNet, StableDiffusion, etc. (ACM MM)
Apache License 2.0

Question: can I apply a partial segmentation control image? #29

Closed DJBen closed 1 year ago

DJBen commented 1 year ago

Summary

I would like to achieve masked inpainting, but limit the segmentation control to only part of the image and leave the regions outside the segmentation mask unconstrained.

Motivation

The motivation comes from the pains of regular inpainting: if I only supply a mask and a prompt without further constraints, the model will often add unwanted features to the subject. In the example below, the inpainted image grows extra legs on the dog.

Prompt: "dog on the beach, best quality"

[Images: original image from the segment-anything example, mask, and inpainted image with controlnet-inpaint]

To prevent this, I am thinking of using a segmentation-mask ControlNet to constrain the extent of the primary subject. I use the following script to assign the masked region to segment 1 and the rest to segment 0, but the model then treats everything outside the mask uniformly. See the produced image.

    def mask_to_control(mask):
        # mask is a binary (0/255) mask after HWC3(...)
        res = np.zeros((mask.shape[0], mask.shape[1], 3), dtype=np.float32)
        # masked region -> segment 1, everything else -> segment 0
        res[:, :, 0] = np.where(mask[:, :, 0] >= 128, 1, 0)
        return res
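
As a sanity check, here is a self-contained version of that helper (with the `np.where` branches matching the stated intent, i.e. masked region → segment 1) applied to a tiny synthetic mask:

```python
import numpy as np

def mask_to_control(mask):
    # mask: H x W x 3 uint8, binary (0 or 255) after HWC3(...)
    res = np.zeros((mask.shape[0], mask.shape[1], 3), dtype=np.float32)
    # masked region -> segment 1, everything else -> segment 0
    res[:, :, 0] = np.where(mask[:, :, 0] >= 128, 1, 0)
    return res

# tiny 4x4 example: the top-left 2x2 block is the masked subject
mask = np.zeros((4, 4, 3), dtype=np.uint8)
mask[:2, :2] = 255
control = mask_to_control(mask)
```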

[Produced image]

I would like the image not to be constrained by the segmentation mask outside of the masked region. Is that possible? I am still new to ControlNet, so please don't hesitate to point out any mistakes. Glad to hear from all of you!

ikuinen commented 1 year ago

Definitely yes, that is a good point. A simple way is to use a lower mask alignment strength, which lets the model generate the region outside the mask more freely. Another way is to give pixel-wise control scales instead of a single scalar.

DJBen commented 1 year ago

@ikuinen thank you for your quick reply! I tried a lower mask alignment strength, but it is suboptimal: it sometimes still grows extra legs, while not diversifying the background much.

> Another way is to give pixel-wise control scales instead of a single scalar.

Can you elaborate on this approach? Which parameter should I supply? 🙏

ikuinen commented 1 year ago

Thanks for your interest. The mask alignment strength is a single control scalar, but I think you could use a heatmap-like control scale instead: the value in the region of the dog could be one, the area around the dog lower, and the rest zero. This is just a simple idea that I haven't tried yet. It may be a feature that could be added to EditAnything.
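
A minimal sketch of that heatmap idea, assuming a linear falloff with distance from the subject mask (the `heatmap_control_scale` helper and its `falloff` parameter are my own illustration, not part of the EditAnything API):

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def heatmap_control_scale(mask, falloff=16.0):
    """Per-pixel control scale: 1.0 inside the subject mask,
    decaying linearly to 0.0 over `falloff` pixels outside it."""
    inside = mask >= 128                    # binary subject region (H x W)
    dist = distance_transform_edt(~inside)  # pixel distance to the mask
    return np.clip(1.0 - dist / falloff, 0.0, 1.0).astype(np.float32)
```

With this, the control is fully enforced on the dog, partially enforced in a band around it, and ignored in the rest of the image.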

DJBen commented 1 year ago

@ikuinen Thank you for your insight - I would like to experiment with your approach, but I couldn't locate a way to provide a heatmap as you described.

In StableDiffusionControlNetInpaintPipeline.__call__, controlnet_conditioning_scale is a float or List[float], and alignment_ratio looks like an Optional[float].

Could you give a code pointer of how this is achieved? Much thanks!

ikuinen commented 1 year ago

Hi @DJBen, you could check out the dev branch. I have implemented a version that controls the scale pixel-wise, via a parameter named controlnet_conditioning_scale_map. By setting a higher scale around the dog and a zero scale for the rest, the problem can largely be solved.
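
A rough sketch of how such a map might be built: scale 1.0 inside the subject mask and within a band around it, 0.0 elsewhere. The `conditioning_scale_map` helper is my own illustration, and the pipeline call is shown only in a comment, since controlnet_conditioning_scale_map lives on the dev branch and I have not verified its exact signature:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def conditioning_scale_map(mask, band=16):
    """Hard-edged variant: 1.0 inside the subject mask and within
    `band` pixels of it, 0.0 everywhere else."""
    inside = mask >= 128
    near = distance_transform_edt(~inside) <= band
    return near.astype(np.float32)

# Hypothetical usage (dev branch; argument name taken from this thread,
# exact signature unverified):
# result = pipe(
#     prompt, image=init_image, mask_image=mask_image,
#     control_image=control_image,
#     controlnet_conditioning_scale_map=conditioning_scale_map(mask),
# ).images[0]
```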

Prompt: "a dog in beach, photorealistic, best quality, extremely detailed" Original image Mask Inpainted image Control scale map
Doggo mask_image dog_res1 control_map
DJBen commented 1 year ago

That is amazing. You are awesome. I'll try it at my earliest convenience.