huggingface / optimum-neuron

Easy, fast and very cheap training and inference on AWS Trainium and Inferentia chips.
Apache License 2.0

Add Stable Diffusion ControlNet support #622

Closed. JingyaHuang closed this 3 days ago

JingyaHuang commented 3 weeks ago

What does this PR do?

Fixes #575

from optimum.neuron import NeuronStableDiffusionControlNetPipeline

model_id = "runwayml/stable-diffusion-v1-5"
controlnet_id = "lllyasviel/sd-controlnet-canny"
save_directory = "sd_neuron_controlnet"

# [Neuron] pipeline
input_shapes = {"batch_size": 1, "height": 512, "width": 512, "num_images_per_prompt": 1}
compiler_args = {"auto_cast": "matmul", "auto_cast_type": "bf16"}
pipe = NeuronStableDiffusionControlNetPipeline.from_pretrained(
    model_id,
    controlnet_ids=controlnet_id,
    export=True,
    **input_shapes,
    **compiler_args,
)
pipe.save_pretrained(save_directory)
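
The compiled pipeline can then be reloaded from the save directory and used for inference, here conditioned on a canny edge map as in the usual diffusers ControlNet workflow:
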
import cv2
import numpy as np
from diffusers import UniPCMultistepScheduler
from diffusers.utils import load_image, make_image_grid
from PIL import Image

from optimum.neuron import NeuronStableDiffusionControlNetPipeline

# prepare canny image
original_image = load_image(
    "https://hf.co/datasets/huggingface/documentation-images/resolve/main/diffusers/input_image_vermeer.png"
)

image = np.array(original_image)

low_threshold = 100
high_threshold = 200

image = cv2.Canny(image, low_threshold, high_threshold)
image = image[:, :, None]
image = np.concatenate([image, image, image], axis=2)
canny_image = Image.fromarray(image)

# [Neuron] pipeline
save_directory = "sd_neuron_controlnet"
pipe = NeuronStableDiffusionControlNetPipeline.from_pretrained(save_directory)

pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
output = pipe("the mona lisa", image=canny_image).images[0]
compare = make_image_grid([original_image, canny_image, output], rows=1, cols=3)
compare.save("compare.png")

Next Steps

Before submitting

HuggingFaceDocBuilderDev commented 2 weeks ago

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Suprhimp commented 2 weeks ago

What happened? I hope it will be merged soon, and SDXL ControlNet support too :)

dacorvo commented 5 days ago

If you push to your pull request again, consider cherry-picking this commit from my branch to fix the TGI Docker build.

JingyaHuang commented 5 days ago

@dacorvo For the tracing, the compiler only accepts tensors, not lists or tuples of tensors, which can occur in transformers. So we flatten the inputs during tracing (in fact, we directly create non-nested dummy inputs), and at inference time we need to flatten the inputs produced by the preprocessor (or by another model in the pipeline, as in the Stable Diffusion case) before feeding them into the compiled model.
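
For illustration, here is a minimal sketch of that runtime flattening, not the actual optimum-neuron implementation: nested lists or tuples of tensors are recursively unpacked into a flat sequence before calling the compiled model. The function name, tensor shapes, and the compiled_unet call below are made up for the example.

import torch

def flatten_inputs(inputs):
    # Recursively unpack nested lists/tuples of tensors into a flat list,
    # since the compiled model only accepts plain tensor arguments.
    flat = []
    for item in inputs:
        if isinstance(item, (list, tuple)):
            flat.extend(flatten_inputs(item))
        elif isinstance(item, torch.Tensor):
            flat.append(item)
    return flat

# e.g. a UNet step in a ControlNet pipeline receives nested residuals
sample = torch.randn(1, 4, 64, 64)
down_block_residuals = (torch.randn(1, 320, 64, 64), torch.randn(1, 320, 64, 64))
mid_block_residual = torch.randn(1, 1280, 8, 8)

flat_args = flatten_inputs([sample, down_block_residuals, mid_block_residual])
# outputs = compiled_unet(*flat_args)  # hypothetical call to the traced model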