huggingface / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
https://huggingface.co/docs/diffusers
Apache License 2.0

Overgrow issue with the SDXL inpainting model for an outpainting task #5600

Closed rajdeep2804 closed 7 months ago

rajdeep2804 commented 9 months ago

I have been exploring SDXL for some time now. However, with the inpainting model I'm facing an overgrow issue in some cases and I'm not able to understand it. I would love to know how I can fix this issue and why it happens.

DN6 commented 9 months ago

Hi @rajdeep2804, can you please describe what you mean by overgrow, and provide a code example that we can use to reproduce the issue?

rajdeep2804 commented 9 months ago

Sure @DN6. I'm trying to generate a background from a prompt, a mask image, and an input image of a particular object such as a shoe or a mug. However, sometimes the model generates random things around the product/mask area, and sometimes it generates the background perfectly, and I'm not sure why this happens. I have tried different schedulers, num_inference_steps, guidance_scale, negative_prompt, strength, prompt, and seed values, but the issue still exists. I'm sharing a few images for reference (outputs, the product mask image, and the inverted mask). I would really like to know where I'm going wrong, what the exact issue is, and how I can fix it.

import torch
from PIL import Image
from diffusers import AutoPipelineForInpainting, DDIMScheduler

# product_image and mask are paths to the input product photo and its mask (defined elsewhere)
pipe = AutoPipelineForInpainting.from_pretrained(
    "diffusers/stable-diffusion-xl-1.0-inpainting-0.1", torch_dtype=torch.float16, variant="fp16"
).to("cuda")
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)

seed = 121434500
generator = torch.Generator(device="cuda").manual_seed(seed)
prompt = "A photorealistic Sandal on a detailed texture white marbel table, in front of window with blue skies,stunning environment, by professional photographer, hyperrealistic, Modernist, sharp focus, 8k, vray, shallow depth of field, Macro lens, Bounce light and soften shadows, Urban settings, Adjust color balance, amazing award winning colored photograph"
negative_prompt = '''out of frame, lowres, text, error, cropped, worst quality, low quality, jpeg artifacts, ugly, duplicate, morbid, blurry, dehydrated, bad anatomy, bad proportions, gross proportions, username, watermark, signature'''
image = Image.open(product_image).convert("RGB").resize((512, 512))
mask_image = Image.open(mask).convert("RGB").resize((512, 512))

# Inpainting call that sometimes overgrows the masked area
image_ = pipe(
    prompt=prompt,
    image=image,
    mask_image=mask_image,
    guidance_scale=10,
    num_inference_steps=20,
    generator=generator,
    strength=1,
    negative_prompt=negative_prompt,
).images[0]
image_.save("output.png")

garychan22 commented 9 months ago

I think the overgrow issue comes from the prior learnt from the training data: the inpainting model does not actually recognize what the given product looks like, so it randomly turns it into another plausible product that fits the context.

DN6 commented 9 months ago

@rajdeep2804 Do you observe the same results when setting strength to less than 1, e.g. between 0.65 and 0.8?

patrickvonplaten commented 9 months ago

Also cc @yiyixuxu here, as this is very similar to the SDv1-5 inpainting issue.

rajdeep2804 commented 9 months ago

Hey @DN6, yeah, if I reduce the strength to the 0.65-0.8 range, the inpainting model doesn't generate a background.

yiyixuxu commented 9 months ago

Hi, you can use the ControlNet inpainting pipeline instead! In addition to the mask, it will let you use a condition (e.g. a depth map) to further enforce the structure of your generation.

it will help a lot with this issue :)

thanks

YiYi
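
For reference, here is a minimal sketch of what this suggestion could look like with the SDXL ControlNet inpainting pipeline. The checkpoint names, file paths, prompt, and parameter values are illustrative assumptions (not a confirmed recipe from this thread), and a depth map of the product is assumed to be precomputed.

import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionXLControlNetInpaintPipeline

# Depth ControlNet for SDXL (illustrative checkpoint choice)
controlnet = ControlNetModel.from_pretrained(
    "diffusers/controlnet-depth-sdxl-1.0", torch_dtype=torch.float16
)
pipe = StableDiffusionXLControlNetInpaintPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")

# Placeholder paths: product photo, background mask, and a precomputed depth map
init_image = Image.open("product.png").convert("RGB").resize((1024, 1024))
mask_image = Image.open("mask.png").convert("L").resize((1024, 1024))
control_image = Image.open("product_depth.png").convert("RGB").resize((1024, 1024))

result = pipe(
    prompt="a photorealistic product photo on a marble table, natural light",
    image=init_image,
    mask_image=mask_image,
    control_image=control_image,
    strength=1.0,               # regenerate the masked background from scratch
    num_inference_steps=30,
    guidance_scale=7.5,
).images[0]
result.save("controlnet_inpaint.png")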

rajdeep2804 commented 9 months ago

Hi @yiyixuxu, however, if I try with ControlNet the backgrounds are not generated properly, while with everything else kept the same SD inpainting gives better results. Can you please let me know where I'm going wrong? (SD inpainting image did not upload) https://github.com/huggingface/diffusers/assets/53480855/4e251b6a-32f2-4d94-a11d-8c528e36ec19 (ControlNet result)

# pipe is the ControlNet inpainting pipeline; init_image, mask_image, and control_image are loaded elsewhere
guidance_scale = 10
num_samples = 1
seed = int(random.randint(1314344000, 1314345000))
generator = torch.Generator(device="cuda").manual_seed(seed)

negative_prompt = '''nsfw, out of frame, lowres, text, error, cropped, worst quality, low quality, jpeg artifacts, ugly, duplicate, morbid, mutilated, out of frame, extra fingers, mutated hands, poorly drawn hands, poorly drawn face, mutation, deformed, blurry, dehydrated, bad anatomy, bad proportions, extra limbs, cloned face, disfigured, gross proportions, malformed limbs, missing arms, missing legs, extra arms, extra legs, fused fingers, too many fingers, long neck, username, watermark, signature'''

output = pipe(
    "A photorealistic bottle on top of a natural hill and rocks surrounded by flowers with blue skies, stunning environment, by professional photographer, hyperrealistic, shallow depth of field, wide aperture, ultra realistic, highly detailed, natural lighting, ambient lighting, close-up, 8k uhd",
    image=init_image,
    mask_image=mask_image,
    control_image=control_image,
    guidance_scale=guidance_scale,
    generator=generator,
    num_images_per_prompt=num_samples,
    num_inference_steps=75,
    eta=0.3,
    negative_prompt=negative_prompt,
).images[0]

garychan22 commented 8 months ago

You need to erase the background in the condition image based on the foreground mask, so that the background generation is aligned with the text prompt.
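
As a rough sketch of that step (the file names and helper function are illustrative, and the mask is assumed to be white on the product and black elsewhere):

import numpy as np
from PIL import Image

def erase_background(control_image, foreground_mask):
    # Keep the condition signal (e.g. depth) only where the product is, so the
    # ControlNet does not force the old background onto the generated one.
    control = np.array(control_image.convert("RGB"), dtype=np.uint8)
    keep = np.array(foreground_mask.convert("L").resize(control_image.size), dtype=np.uint8) > 127
    control[~keep] = 0
    return Image.fromarray(control)

control_image = erase_background(Image.open("product_depth.png"), Image.open("product_mask.png"))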

Brembles commented 8 months ago

> Hey @DN6, yeah, if I reduce the strength to the 0.65-0.8 range, the inpainting model doesn't generate a background.

Hi, I encountered a similar problem to yours. Even when the strength is set to 0.95, the background cannot be generated correctly and is still black. Have you solved this problem?

yiyixuxu commented 8 months ago

Hi @Brembles: you will need to set the strength to 1.0 if you want to generate a "new" background - otherwise it will generate a background similar to your masked content, i.e. similar to the mask_content = original option in AUTOMATIC1111.
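
In code terms, the difference is just the strength argument passed to the inpainting call. A sketch reusing the pipeline and inputs from the script earlier in this thread:

# strength=1.0: the masked area is fully re-noised, so a brand-new background
# is generated from the prompt.
new_bg = pipe(prompt=prompt, image=image, mask_image=mask_image,
              strength=1.0, generator=generator).images[0]

# strength < 1.0 (e.g. 0.7): part of the original masked-out content is kept,
# so the result stays close to whatever was under the mask (often plain black).
kept_bg = pipe(prompt=prompt, image=image, mask_image=mask_image,
               strength=0.7, generator=generator).images[0]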

github-actions[bot] commented 7 months ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.