Closed: CatLoves closed this issue 1 month ago
I don't think we need to default to `{}` here because it becomes effective only when we're using SDXL models. WDYT @yiyixuxu?
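For context, a simplified sketch (abbreviated, not the verbatim diffusers source) of the branch inside `ControlNetModel.forward` that makes this dict SDXL-only: it is consulted only when the controlnet config sets `addition_embed_type="text_time"`, which is the SDXL convention.

```python
# Sketch only: abbreviated from ControlNetModel.forward, not the verbatim source.
# SD 1.5 controlnets never read added_cond_kwargs; SDXL controlnets set
# addition_embed_type="text_time" in their config and require these two keys.
if self.config.addition_embed_type == "text_time":
    text_embeds = added_cond_kwargs["text_embeds"]  # SDXL pooled prompt embeddings
    time_ids = added_cond_kwargs["time_ids"]        # SDXL size/crop micro-conditioning
```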
Yes, you are right. Thank you for the quick response! But this naturally leads to one question: when the user uses an SDXL model, why does the code raise the `ValueError` exception? The minimal reproduction snippet is the following:
```python
from diffusers import AutoPipelineForInpainting, StableDiffusionControlNetInpaintPipeline, ControlNetModel
from diffusers.utils import load_image
import torch

# build the diffusion model pipeline
controlnet_canny = ControlNetModel.from_pretrained(
    "xinsir/controlnet-canny-sdxl-1.0", torch_dtype=torch.float16
)
print("=> controlnet_canny is ready")
pipe = StableDiffusionControlNetInpaintPipeline.from_pretrained(
    "diffusers/stable-diffusion-xl-1.0-inpainting-0.1",
    controlnet=controlnet_canny,
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")

img_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo.png"
mask_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo_mask.png"

image = load_image(img_url).resize((1024, 1024))
mask_image = load_image(mask_url).resize((1024, 1024))
canny_image = get_canny_edge(image)

prompt = "a tiger sitting on a park bench"
generator = torch.Generator(device="cuda").manual_seed(0)

generated_image = pipe(
    prompt=prompt,
    image=image,
    mask_image=mask_image,
    control_image=canny_image,
    guidance_scale=8.0,
    num_inference_steps=20,  # steps between 15 and 30 work well for us
    strength=0.99,  # make sure to use `strength` below 1.0
    generator=generator,
).images[0]
```
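Note that `get_canny_edge` is not defined in the snippet; a minimal sketch of what such a helper might look like, assuming OpenCV and mirroring the usual ControlNet canny preprocessing (the name and signature are only what the snippet above implies):

```python
import cv2
import numpy as np
from PIL import Image

def get_canny_edge(image: Image.Image, low: int = 100, high: int = 200) -> Image.Image:
    """Hypothetical helper: Canny edge map replicated to 3 channels for ControlNet."""
    edges = cv2.Canny(np.array(image), low, high)            # uint8 H x W edge map
    edges = np.concatenate([edges[:, :, None]] * 3, axis=2)  # H x W x 3 for the pipeline
    return Image.fromarray(edges)
```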
The above code hits the `ValueError` because `added_cond_kwargs` is `None` by default. So does diffusers currently just not support the `diffusers/stable-diffusion-xl-1.0-inpainting-0.1` + `xinsir/controlnet-canny-sdxl-1.0` pipeline, or is this a bug in diffusers? BTW: the same code works fine if I use the `lllyasviel/control_v11p_sd15_inpaint` + `lllyasviel/control_v11p_sd15_canny` pipeline instead.
You are not using the right class. You need to use this: https://github.com/huggingface/diffusers/blob/413604405fddb4692a8e9a9a9fb6c353d22881ea/src/diffusers/pipelines/controlnet/pipeline_controlnet_inpaint_sd_xl.py#L154
Yes, you are right: after switching to `StableDiffusionXLControlNetInpaintPipeline`, the code runs fine. Thanks again for your kind and quick response! I will close this issue now.
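For anyone hitting the same error, a minimal sketch of the corrected setup (same models as above, only the pipeline class changed):

```python
import torch
from diffusers import ControlNetModel, StableDiffusionXLControlNetInpaintPipeline

controlnet_canny = ControlNetModel.from_pretrained(
    "xinsir/controlnet-canny-sdxl-1.0", torch_dtype=torch.float16
)
# The SDXL variant of the inpaint pipeline builds added_cond_kwargs
# (text_embeds, time_ids) internally, so the "text_time" check in
# ControlNetModel.forward is satisfied.
pipe = StableDiffusionXLControlNetInpaintPipeline.from_pretrained(
    "diffusers/stable-diffusion-xl-1.0-inpainting-0.1",
    controlnet=controlnet_canny,
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")
```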
Describe the bug
Version: 0.28.0.dev0

Bug description: in `ControlNetModel`'s `forward` function, the default value of the `added_cond_kwargs` argument should be `{}` rather than `None`.

Reason: in the source code (`controlnet.py`):

```python
elif self.config.addition_embed_type == "text_time":
    if "text_embeds" not in added_cond_kwargs:
        raise ValueError(
            f"{self.__class__} has the config param `addition_embed_type` set to 'text_time' "
            f"which requires the keyword argument `text_embeds` to be passed in `added_cond_kwargs`"
        )
```

If the default value is `None`, this code raises a `ValueError`.

Pull request: I intend to submit a pull request for this issue.
Reproduction

```python
from diffusers import AutoPipelineForInpainting, StableDiffusionControlNetInpaintPipeline, ControlNetModel
from diffusers.utils import load_image
import torch

controlnet_canny = ControlNetModel.from_pretrained(
    "xinsir/controlnet-canny-sdxl-1.0", torch_dtype=torch.float16
)
print("=> controlnet_canny is ready")
pipe = StableDiffusionControlNetInpaintPipeline.from_pretrained(
    "diffusers/stable-diffusion-xl-1.0-inpainting-0.1",
    controlnet=controlnet_canny,
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")

img_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo.png"
mask_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo_mask.png"

image = load_image(img_url).resize((1024, 1024))
mask_image = load_image(mask_url).resize((1024, 1024))
canny_image = get_canny_edge(image)

prompt = "a tiger sitting on a park bench"
generator = torch.Generator(device="cuda").manual_seed(0)

generated_image = pipe(
    prompt=prompt,
    image=image,
    mask_image=mask_image,
    control_image=canny_image,
    guidance_scale=8.0,
    num_inference_steps=20,  # steps between 15 and 30 work well for us
    strength=0.99,  # make sure to use `strength` below 1.0
    generator=generator,
).images[0]
```

Logs
No response
System Info
Who can help?
@sayakpaul @yiyixuxu