huggingface / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
https://huggingface.co/docs/diffusers
Apache License 2.0

How should I create a ControlNet without a model using diffusers and specify its preprocessor as reference_adain+attn #3612

Closed vvlookman closed 1 year ago

vvlookman commented 1 year ago

Is your feature request related to a problem? Please describe. I need to implement in diffusers the same functionality as ControlNet in the SD WebUI with the reference_adain+attn preprocessor specified, but I can't find an available method.

Describe the solution you'd like I'd like to be able to create a ControlNet without a model and specify its preprocessor as reference_adain+attn.

[screenshot]
kadirnar commented 1 year ago

https://github.com/huggingface/diffusers/tree/main/examples/community#stable-diffusion-controlnet-reference

vvlookman commented 1 year ago

https://github.com/huggingface/diffusers/tree/main/examples/community#stable-diffusion-controlnet-reference

Thank you for your reply!

I tried the new class StableDiffusionControlNetReferencePipeline, but got an error:

TypeError: Transformer2DModel.forward() got an unexpected keyword argument 'attention_mask'

Since the StableDiffusionControlNetReferencePipeline that the example depends on is in the main branch and not yet officially released, I used diffusers v0.16.1 plus this class; I'm not sure if there are any version inconsistencies.

vvlookman commented 1 year ago

https://github.com/huggingface/diffusers/tree/main/examples/community#stable-diffusion-controlnet-reference

And there is another problem: the pipeline also requires the controlnet parameter, but the reference preprocessor is model-free and I do not want to use a pretrained ControlNet model.

I see that there is another StableDiffusionReferencePipeline class; is this consistent with the ControlNet in the SD WebUI that does not specify a model?

sayakpaul commented 1 year ago

I don't understand your questions fully.

You can initialize a ControlNetModel like so:

from diffusers import ControlNetModel

controlnet = ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-canny")

If you don't want to initialize from a pre-trained checkpoint, that is also possible. You can check out the tests from here:

https://github.com/huggingface/diffusers/blob/4f14b363297cf8deac3e88a3bf31f59880ac8a96/tests/pipelines/controlnet/test_controlnet.py#L117

But if the model needs modification, then you will have to start with https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/controlnet.py
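
For illustration only (not part of the original reply), here is a minimal sketch of one way to get a ControlNetModel without a pre-trained ControlNet checkpoint, by initializing it from the base UNet with ControlNetModel.from_unet; the SD 1.5 repo id below is just an example:

from diffusers import ControlNetModel, UNet2DConditionModel

# Load only the UNet of the base model (SD 1.5 here as an example).
unet = UNet2DConditionModel.from_pretrained("runwayml/stable-diffusion-v1-5", subfolder="unet")

# Build a ControlNet whose encoder mirrors the UNet's architecture and copies its weights.
# The result is an untrained ControlNet (its zero-convolutions start at zero), i.e. a
# starting point for training rather than something that produces useful conditioning as-is.
controlnet = ControlNetModel.from_unet(unet)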

kadirnar commented 1 year ago

https://github.com/huggingface/diffusers/tree/main/examples/community#stable-diffusion-controlnet-reference

And there is another problem: the pipeline also requires the controlnet parameter, but the reference preprocessor is model-free and I do not want to use a pretrained ControlNet model.

I see that there is another StableDiffusionReferencePipeline class; is this consistent with the ControlNet in the SD WebUI that does not specify a model?

You have to use controlnet. https://github.com/Mikubill/sd-webui-controlnet/discussions/1236

vvlookman commented 1 year ago

https://github.com/huggingface/diffusers/tree/main/examples/community#stable-diffusion-controlnet-reference

And there is another problem: the pipeline also requires the controlnet parameter, but the reference preprocessor is model-free and I do not want to use a pretrained ControlNet model. I see that there is another StableDiffusionReferencePipeline class; is this consistent with the ControlNet in the SD WebUI that does not specify a model?

You have to use controlnet. Mikubill/sd-webui-controlnet#1236

So how should I create this ControlNet? A pre-trained model does not need to be specified in the ControlNet tab in the SD WebUI:

[screenshot of the SD WebUI ControlNet tab]
kadirnar commented 1 year ago

ControlNet + Canny preprocessor + Reference_attn + Reference_adain:

import cv2
import torch
import numpy as np
from PIL import Image
from diffusers import ControlNetModel, UniPCMultistepScheduler
from diffusers.utils import load_image

# NOTE: StableDiffusionControlNetReferencePipeline is a community pipeline
# (examples/community/stable_diffusion_controlnet_reference.py), not part of the core
# diffusers API. The import below assumes that file has been downloaded next to this
# script; see the custom_pipeline alternative after this snippet.
from stable_diffusion_controlnet_reference import StableDiffusionControlNetReferencePipeline

input_image = load_image("https://hf.co/datasets/huggingface/documentation-images/resolve/main/diffusers/input_image_vermeer.png")

# get canny image
image = cv2.Canny(np.array(input_image), 100, 200)
image = image[:, :, None]
image = np.concatenate([image, image, image], axis=2)
canny_image = Image.fromarray(image)

controlnet = ControlNetModel.from_pretrained("lllyasviel/control_v11p_sd15_canny", torch_dtype=torch.float16)
pipe = StableDiffusionControlNetReferencePipeline.from_pretrained(
       "runwayml/stable-diffusion-v1-5",
       controlnet=controlnet,
       safety_checker=None,
       torch_dtype=torch.float16
       ).to('cuda:0')

pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)

result_img = pipe(ref_image=input_image,
      prompt="1girl",
      image=canny_image,
      num_inference_steps=20,
      reference_attn=True,
      reference_adain=True).images[0]

Other ControlNet models: https://huggingface.co/lllyasviel
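
If you'd rather not copy the community file locally, an alternative sketch (assuming a diffusers version recent enough to fetch community pipelines) is to load it through the custom_pipeline argument:

import torch
from diffusers import ControlNetModel, DiffusionPipeline

controlnet = ControlNetModel.from_pretrained("lllyasviel/control_v11p_sd15_canny", torch_dtype=torch.float16)

# `custom_pipeline="stable_diffusion_controlnet_reference"` pulls the community file
# stable_diffusion_controlnet_reference.py from the diffusers main branch at load time.
pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    custom_pipeline="stable_diffusion_controlnet_reference",
    controlnet=controlnet,
    safety_checker=None,
    torch_dtype=torch.float16,
).to("cuda")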

vvlookman commented 1 year ago

So what should I do if I want to remove the "Canny model" in this sample?

When I use the SD WebUI, no pre-trained model needs to be specified in the ControlNet tab if I choose a reference preprocessor.

sayakpaul commented 1 year ago

You can pass a ControlNetModel of your choosing as long as it's compatible. You will also have to ensure that you're using the right preprocessor to preprocess the conditioning image.

Please refer to the docs for more details: https://huggingface.co/docs/diffusers/main/en/api/pipelines/stable_diffusion/controlnet

vvlookman commented 1 year ago

I think I may still not have described the problem clearly.

My question is: when I select the reference_adain+attn preprocessor in the ControlNet tab in the SD WebUI, how should I create the corresponding StableDiffusionControlNetReferencePipeline in diffusers? Obviously, no pre-trained model is specified for the ControlNet here.

If I need to build the ControlNetModel manually, how should I set the parameters? I want to be able to match the behavior configured in the SD WebUI.

[screenshot of the SD WebUI ControlNet tab]
sayakpaul commented 1 year ago

Cc @williamberman for the above.

bharathithal commented 1 year ago

Can anybody tell me how to resolve the issue @vvlookman raised about TypeError: Transformer2DModel.forward() got an unexpected keyword argument 'attention_mask'? Is it a version incompatibility or something? I'm trying to run this in a Google Colab notebook, using the same example shown in the community README.

vvlookman commented 1 year ago

Can anybody tell me how to resolve the issue @vvlookman raised about TypeError: Transformer2DModel.forward() got an unexpected keyword argument 'attention_mask'? Is it a version incompatibility or something? I'm trying to run this in a Google Colab notebook, using the same example shown in the community README.

Installing diffusers directly from the main branch works around this problem.
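
Concretely (same !pip style as the Colab snippets later in the thread):

!pip install git+https://github.com/huggingface/diffusers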

williamberman commented 1 year ago

Sorry, I'm not familiar with the reference_adain+attn preprocessor. Is this WebUI using diffusers or the original ControlNet implementation?

galaturka commented 1 year ago

Hello,

I'm trying to use the StableDiffusionControlNetReferencePipeline class, but during execution of the pipe call I get the error below in Colab:

[screenshot of the error traceback]

It seems the example code does not have a "_default_height_width" attribute. I then tried to pass the image width and height manually as 512, 512. Now I get this error:

[screenshot of the error traceback]

Did you get similar errors in your implementations? Any suggestions would be appreciated.

sayakpaul commented 1 year ago

Cc: @kadirnar, could you maybe check here?

kadirnar commented 1 year ago

Cc: @kadirnar, could you maybe check here?

I fixed the error. Also the readme.md file is wrong.

galaturka commented 1 year ago

It works! Thanks a lot for support.

sayakpaul commented 1 year ago

@vvlookman could we close the issue here?

vvlookman commented 1 year ago

Is that StableDiffusionControlNetReferencePipeline available now? I'm not sure if the pipeline will solve my problem.

@vvlookman could we close the issue here?

sayakpaul commented 1 year ago

@kadirnar for the ^.

kadirnar commented 1 year ago

Is that StableDiffusionControlNetReferencePipeline available now? I'm not sure if the pipeline will solve my problem.

@vvlookman could we close the issue here?

Stable Diffusion + Reference Pipeline: https://github.com/huggingface/diffusers/tree/main/examples/community#stable-diffusion-reference

Stable Diffusion + Controlnet + Reference Pipeline: https://github.com/huggingface/diffusers/tree/main/examples/community#stable-diffusion-controlnet-reference
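
For the model-free case this thread is asking about, a minimal sketch along the lines of the linked Stable Diffusion Reference community example (parameter names assumed to match that README):

import torch
from diffusers import DiffusionPipeline, UniPCMultistepScheduler
from diffusers.utils import load_image

ref_image = load_image("https://hf.co/datasets/huggingface/documentation-images/resolve/main/diffusers/input_image_vermeer.png")

# No ControlNet checkpoint is involved: only the base SD model plus a reference image.
pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    custom_pipeline="stable_diffusion_reference",
    safety_checker=None,
    torch_dtype=torch.float16,
).to("cuda")
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)

result = pipe(
    ref_image=ref_image,
    prompt="1girl",
    num_inference_steps=20,
    reference_attn=True,
    reference_adain=True,
).images[0]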

kadirnar commented 1 year ago

I think I may still not have described the problem clearly.

My question is: when I select the reference_adain+attn preprocessor in the ControlNet tab in the SD WebUI, how should I create the corresponding StableDiffusionControlNetReferencePipeline in diffusers? Obviously, no pre-trained model is specified for the ControlNet here.

If I need to build the ControlNetModel manually, how should I set the parameters? I want to be able to match the behavior configured in the SD WebUI.

[screenshot of the SD WebUI ControlNet tab]

The settings in this screenshot aren't using a ControlNet model. They just enable the preprocessing step.

andysingal commented 1 year ago
!pip install diffusers==0.18.2
!pip install scipy ftfy accelerate
!pip install git+https://github.com/huggingface/transformers
from diffusers import  LMSDiscreteScheduler,DiffusionPipeline,StableDiffusionPipeline,ControlNetModel
import torch
import transformers
from transformers import AutoProcessor, AdamW
from transformers import BlipForConditionalGeneration

from huggingface_hub import notebook_login

notebook_login()

# this will substitute the default PNDM scheduler for K-LMS  
lms = LMSDiscreteScheduler(
    beta_start=0.00085, 
    beta_end=0.012, 
    beta_schedule="scaled_linear"
)

guidance_scale=8.5
seed=777
steps=50

cartoon_model_path = "Andyrasika/lora_diffusion"
controlnet = ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16)
# NOTE: StableDiffusionPipeline does not accept a `controlnet` component, so the
# ControlNet loaded above is ignored here (diffusers only logs a warning). To actually
# apply ControlNet conditioning, use StableDiffusionControlNetPipeline and pass a
# conditioning image to the call.
cartoon_pipe = StableDiffusionPipeline.from_pretrained(cartoon_model_path,
                                                       controlnet=controlnet,
                                                       scheduler=lms,
                                                       torch_dtype=torch.float16)
cartoon_pipe.to("cuda")

def generate(prompt, file_prefix ,samples):
    torch.manual_seed(seed)
    prompt += ", Very detailed, clean, high quality, sharp image"
    cartoon_images = cartoon_pipe([prompt] * samples, num_inference_steps=steps, guidance_scale=guidance_scale)["images"]
    for idx, image in enumerate(cartoon_images):
        image.save(f"{file_prefix}-{idx}-{seed}-sd2-simpsons-blip.jpg")

generate("An oil painting of Snoop Dogg as a simpsons character", "01_SnoopDog", 4)
generate("Gal Gadot, cartoon", "02_GalGadot", 4)
generate("A cartoony Simpsons town", "03_SimpsonsTown", 4)
generate("Pikachu with the Simpsons, Eric Wallis", "04_PikachuSimpsons", 4)
sudip550 commented 1 year ago
(quotes the code from @andysingal's comment above)

Hi, can you please share code for using StableDiffusionControlNetReferencePipeline in Colab?

johndpope commented 8 months ago

https://github.com/search?q=StableDiffusionControlNetReferencePipeline&type=code

wjkoh commented 6 months ago

Is there a version of StableDiffusionControlNetReferencePipeline for SDXL as well? I was only able to find StableDiffusionXLReferencePipeline, not StableDiffusionXLControlNetReferencePipeline.

innat-asj commented 4 months ago

@vvlookman Did StableDiffusionControlNetReferencePipeline work for you? I've just tried it for a case, and the results are horrible.

@sayakpaul Would it be possible to add this to the core library (additionally with inpainting APIs)? Similar: https://github.com/huggingface/diffusers/issues/5927