huggingface / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
https://huggingface.co/docs/diffusers
Apache License 2.0

EulerAncestralDiscreteScheduler is generating weird results when used with StableDiffusionControlNetInpaintPipeline #7348

Closed · nayan-dhabarde closed this issue 8 months ago

nayan-dhabarde commented 8 months ago

Describe the bug

I was trying the demo from https://huggingface.co/docs/diffusers/en/using-diffusers/controlnet#inpainting

The only change I made was using a different scheduler: EulerAncestralDiscreteScheduler instead of UniPCMultistepScheduler.

And the results are horrible. Here is the code:

import torch
from diffusers import StableDiffusionControlNetInpaintPipeline, ControlNetModel, AutoencoderKL, EulerAncestralDiscreteScheduler

controlnet = ControlNetModel.from_pretrained("lllyasviel/control_v11p_sd15_inpaint", torch_dtype=torch.float16, use_safetensors=True)

pipe = StableDiffusionControlNetInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16, use_safetensors=True
)

# Setup VAE
vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse", torch_dtype=torch.float16)
pipe.vae = vae

# Setup scheduler
pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)
pipe.enable_model_cpu_offload()
output = pipe(
    "corgi face with large ears, detailed, pixar, animated, disney",
    num_inference_steps=20,
    eta=1.0,
    image=init_image,
    mask_image=mask_image,
    control_image=control_image,
).images[0]
make_image_grid([init_image, mask_image, output], rows=1, cols=3)

Everything else is the same as the tutorial. Here are the results:

image

Am I missing something?

Reproduction

from diffusers.utils import load_image, make_image_grid

init_image = load_image(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/controlnet-inpaint.jpg"
)
init_image = init_image.resize((512, 512))

mask_image = load_image(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/controlnet-inpaint-mask.jpg"
)
mask_image = mask_image.resize((512, 512))
make_image_grid([init_image, mask_image], rows=1, cols=2)

import numpy as np
import torch

def make_inpaint_condition(image, image_mask):
    image = np.array(image.convert("RGB")).astype(np.float32) / 255.0
    image_mask = np.array(image_mask.convert("L")).astype(np.float32) / 255.0

    assert image.shape[0:1] == image_mask.shape[0:1]
    image[image_mask > 0.5] = -1.0  # set as masked pixel
    image = np.expand_dims(image, 0).transpose(0, 3, 1, 2)
    image = torch.from_numpy(image)
    return image

control_image = make_inpaint_condition(init_image, mask_image)

from diffusers import StableDiffusionControlNetInpaintPipeline, ControlNetModel, AutoencoderKL, EulerAncestralDiscreteScheduler

controlnet = ControlNetModel.from_pretrained("lllyasviel/control_v11p_sd15_inpaint", torch_dtype=torch.float16, use_safetensors=True)
pipe = StableDiffusionControlNetInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16, use_safetensors=True
)

# Setup VAE
vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse", torch_dtype=torch.float16)
pipe.vae = vae

# Setup scheduler
pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)
pipe.enable_model_cpu_offload()

output = pipe(
    "corgi face with large ears, detailed, pixar, animated, disney",
    num_inference_steps=20,
    eta=1.0,
    image=init_image,
    mask_image=mask_image,
    control_image=control_image,
).images[0]
make_image_grid([init_image, mask_image, output], rows=1, cols=3)

Logs

No response

System Info

RTX 3090 24GB, RunPod instance running Ubuntu

Who can help?

@sayakpaul @yiyixuxu

Diffusers version?

v0.27.0

jasstionzyf commented 8 months ago

StableDiffusionInpaintPipeline also has the same issue after updating to version 0.27.0, but previous versions like 0.26.x work normally. Also, only non-inpainting checkpoints have problems; dedicated inpainting models like SD 1.5 inpainting have no issues.
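For reference, here is a minimal sketch of the same scheduler swap applied to StableDiffusionInpaintPipeline; the checkpoint, prompt, and input images are placeholders borrowed from the docs example, not the exact setup behind this report:

import torch
from diffusers import StableDiffusionInpaintPipeline, EulerAncestralDiscreteScheduler
from diffusers.utils import load_image

init_image = load_image(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/controlnet-inpaint.jpg"
).resize((512, 512))
mask_image = load_image(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/controlnet-inpaint-mask.jpg"
).resize((512, 512))

# A regular (non-inpainting) checkpoint; dedicated inpainting checkpoints
# reportedly do not show the regression.
pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16, use_safetensors=True
)
pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)
pipe.enable_model_cpu_offload()

output = pipe(
    "corgi face with large ears, detailed, pixar, animated, disney",
    image=init_image,
    mask_image=mask_image,
    num_inference_steps=20,
).images[0]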

nayan-dhabarde commented 8 months ago

I did check: v0.26.0 works fine. Later versions have problems.
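A quick way to confirm which release is active when comparing runs (for example after pinning the last known-good release with pip install diffusers==0.26.0):

import diffusers

# 0.26.0 is the reported known-good version; 0.27.0 reproduces the bug
print(diffusers.__version__)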

sayakpaul commented 8 months ago

Cc: @yiyixuxu

With an earlier version (without changing any code), you're getting the expected results?

nayan-dhabarde commented 8 months ago

Yes, it works correctly on v0.26.0; I haven't tried any intermediate versions.

tolgacangoz commented 8 months ago

I wonder if this resolves the issue completely 🤔:

-if self.begin_index is None:
+if self.begin_index is None or self.begin_index == 0:
    step_indices = [self.index_for_timestep(t, schedule_timesteps) for t in timesteps]
else:
    step_indices = [self.begin_index] * timesteps.shape[0]

With this change, the results seem proper (rather than full noise) for EulerDiscreteScheduler and EulerAncestralDiscreteScheduler, and better for UniPCMultistepScheduler and DPMSolverMultistepScheduler.
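For context, here is a standalone sketch of the lookup being patched; the names follow the diff above, and this is an illustration of the suspected failure mode, not the library code. Any non-None begin_index, including 0, skips the per-timestep lookup, so every timestep maps to the same index; with begin_index == 0 that is the largest sigma, which would explain the full-noise results:

import torch

def step_indices(timesteps, schedule_timesteps, begin_index):
    if begin_index is None:
        # per-timestep lookup: each t gets its own sigma index
        return [(schedule_timesteps == t).nonzero()[0].item() for t in timesteps]
    # any non-None value, including 0, short-circuits the lookup
    return [begin_index] * timesteps.shape[0]

schedule = torch.arange(999, -1, -50)    # hypothetical 20-step schedule
ts = torch.tensor([999, 499, 49])
print(step_indices(ts, schedule, None))  # [0, 10, 19] -> distinct sigmas per timestep
print(step_indices(ts, schedule, 0))     # [0, 0, 0]   -> max-noise sigma for all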

jasstionzyf commented 8 months ago

It still has issues compared with version 0.26.0. Previous results:

WechatIMG1895

WechatIMG1896

Results with this fixed version:

WechatIMG1897

WechatIMG1898

@yiyixuxu

yiyixuxu commented 8 months ago

@jasstionzyf This gives me identical results between v0.26 and the current main. Would you be able to provide a script?

from diffusers.utils import load_image, make_image_grid

init_image = load_image(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/controlnet-inpaint.jpg"
)
init_image = init_image.resize((512, 512))

mask_image = load_image(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/controlnet-inpaint-mask.jpg"
)
mask_image = mask_image.resize((512, 512))
make_image_grid([init_image, mask_image], rows=1, cols=2)

import numpy as np
import torch

def make_inpaint_condition(image, image_mask):
    image = np.array(image.convert("RGB")).astype(np.float32) / 255.0
    image_mask = np.array(image_mask.convert("L")).astype(np.float32) / 255.0

    assert image.shape[0:1] == image_mask.shape[0:1]
    image[image_mask > 0.5] = -1.0  # set as masked pixel
    image = np.expand_dims(image, 0).transpose(0, 3, 1, 2)
    image = torch.from_numpy(image)
    return image

control_image = make_inpaint_condition(init_image, mask_image)

from diffusers import StableDiffusionControlNetInpaintPipeline, ControlNetModel, EulerAncestralDiscreteScheduler, UniPCMultistepScheduler

controlnet = ControlNetModel.from_pretrained("lllyasviel/control_v11p_sd15_inpaint", torch_dtype=torch.float16, use_safetensors=True)
pipe = StableDiffusionControlNetInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16, use_safetensors=True
)

pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)
#pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
pipe.enable_model_cpu_offload()
generator = torch.Generator(device="cpu").manual_seed(33)

output = pipe(
    "corgi face with large ears, detailed, pixar, animated, disney",
    num_inference_steps=20,
    eta=1.0,
    image=init_image,
    mask_image=mask_image,
    generator=generator,
    control_image=control_image,
).images[0]
make_image_grid([init_image, mask_image, output], rows=1, cols=3).save(f"yiyi_test_out_v026.png")

v0.26: yiyi_test_out_v026

v0.27 (main): yiyi_test_out_v027

jasstionzyf commented 7 months ago

@yiyixuxu Thanks for your patience; my mistake. It now works as expected.

crapthings commented 7 months ago

I wanted to invert the mask, but the result turned out like this, and the colors seem changed.

4401710927696_ pic


4411710928181_ pic

from diffusers.utils import load_image, make_image_grid
from transparent_background import Remover
from PIL import Image, ImageOps

from utils import extract_origin_pathname, upload_image, rounded_size, open_url, resize_image, upload_json

limit = 1024

p = Remover(device='cuda', ckpt='./original.pth')

init_image = load_image('./ben-iwara-KMy-lgpQTfw-unsplash.jpg')

init_image = resize_image(init_image, limit, limit)

# display(init_image)

# init_image = init_image.resize((512, 512))

mask_image = p.process(init_image, type='map')

mask_image = ImageOps.invert(mask_image)

# mask_image = mask_image.resize((512, 512))
# make_image_grid([init_image, mask_image], rows=1, cols=2)

import numpy as np
import torch

def make_inpaint_condition(image, image_mask):
    image = np.array(image.convert("RGB")).astype(np.float32) / 255.0
    image_mask = np.array(image_mask.convert("L")).astype(np.float32) / 255.0

    assert image.shape[0:1] == image_mask.shape[0:1]
    image[image_mask > 0.5] = -1.0  # set as masked pixel
    image = np.expand_dims(image, 0).transpose(0, 3, 1, 2)
    image = torch.from_numpy(image)
    return image

control_image = make_inpaint_condition(init_image, mask_image)

from diffusers import StableDiffusionControlNetInpaintPipeline, ControlNetModel, EulerAncestralDiscreteScheduler, UniPCMultistepScheduler

controlnet = ControlNetModel.from_pretrained("lllyasviel/control_v11p_sd15_inpaint", torch_dtype=torch.float16, use_safetensors=True)

pipe = StableDiffusionControlNetInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16, use_safetensors=True, safety_checker=None
)

# pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
pipe.enable_model_cpu_offload()
# genrator = torch.Generator(device="cpu").manual_seed(33)

output = pipe(
    "a photo of a outdoor, beach",
    num_inference_steps=20,
    eta=1.0,
    image=init_image,
    mask_image=mask_image,
    # generator=genrator,
    control_image=control_image,
).images[0]

make_image_grid([init_image, mask_image, output], rows=1, cols=3)
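As a side note, if the goal is to keep the un-repainted region pixel-identical to the input, a common post-processing step is to composite the original pixels back outside the mask, which removes any VAE round-trip color drift there. This is a sketch assuming the white part of mask_image marks the repainted region; it is not part of the script above:

from PIL import Image

# white (255) in the mask keeps the inpainted output,
# black (0) restores the original pixels
final = Image.composite(output, init_image, mask_image.convert("L"))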

yiyixuxu commented 7 months ago

@crapthings Are you specifically seeing a different result with the same code compared with an earlier diffusers version (i.e. v0.26), or are you just not happy with the result? If it is the latter, can you open a new issue?

crapthings commented 7 months ago

@yiyixuxu The result on the right looks very orange.

Okay, will test.