Open KeyaoZhao opened 1 year ago
I think I have the same question. I would like to add objects to a preexisting photo scene. It seems onerous to have to define a mask; I'd want the added objects simply placed "organically" in correct or plausible locations.
I tried setting 'support_pil_img' = ori_img (without a mask) and 'inpainting_mask' = a one-channel mask image. But the result of if_II_kwargs is exactly the same as 'support_pil_img'; the prompt seems to have no influence on the output. How should I fix this?
Hello, I have the same problem. I tried mask shapes of [h,w], [h,w,3], and [1,h,w,3], but all of them failed. Did you figure out what data type and shape the mask should be?
I still have no idea how to reproduce the result of the example inpainting. But if you want to add text to the image, you can try TextDiffuser.
Thanks. If I figure out how to make it work, I will share it here.
I managed to make it work after a deep look at the code. What you should provide is a mask of type torch.FloatTensor with shape [1, 3, h, w]. Set the mask values to 1 where you want the model to modify the image, and 0 where the model should leave the pixels untouched.
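For anyone starting from a one-channel [h, w] mask, as in the attempts above, here is a minimal sketch of getting it into that layout. The helper name `mask_to_if_layout`, the 0.5 threshold, and the 0-255 input range are my own assumptions for illustration; none of them come from the repository.

```python
import numpy as np


def mask_to_if_layout(mask_hw, threshold=0.5):
    """Turn a one-channel [h, w] mask into a float32 array of shape [1, 3, h, w].

    Hypothetical helper: 1 marks pixels to modify, 0 marks pixels to keep.
    """
    mask = np.asarray(mask_hw, dtype=np.float32)
    if mask.ndim != 2:
        raise ValueError(f"expected a [h, w] mask, got shape {mask.shape}")
    # Assumption: values above 1 mean the mask came from an 8-bit image (0..255).
    if mask.max() > 1.0:
        mask = mask / 255.0
    # Binarize so the mask holds only 0.0 and 1.0.
    mask = (mask > threshold).astype(np.float32)
    # Add batch and channel axes, then repeat the mask across 3 channels.
    return np.repeat(mask[None, None, :, :], 3, axis=1)


# Example: mark a rectangle in a 64x96 image as the region to repaint.
h, w = 64, 96
mask_hw = np.zeros((h, w), dtype=np.uint8)
mask_hw[16:48, 24:72] = 255
mask_np = mask_to_if_layout(mask_hw)

try:
    import torch

    # float32 NumPy array -> CPU float tensor of shape [1, 3, h, w]
    inpainting_mask = torch.from_numpy(mask_np)
except ImportError:
    inpainting_mask = None  # torch is not installed in this environment
```

The resulting tensor is what you would pass as 'inpainting_mask'. You may still need to move it to the same device as the model.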
For this solution to work properly, you'll need to apply the patch from pull request #64.
Furthermore, if your image has an aspect ratio that does not round cleanly, the shape of the image generated in the first stage may differ from the shape of the mask and support noise. To address this, I have proposed a fix in pull request #125.
What is inpainting_mask when using Zero-shot Inpainting? Should we mask the raw_pil_image first, and will the model then inpaint the masked part? Thanks a lot!