dakenf / diffusers.js

diffusers implementation for node.js and browser
https://islamov.ai/diffusers.js/

Canny Edge Detection StableDiffusionControlNet and StableDiffusionControlNetImg2Img Pipelines #8

Closed jdp8 closed 11 months ago

jdp8 commented 11 months ago

Description

This PR adds the StableDiffusionControlNet pipeline, specifically for the Canny Edge Detection model, along with the pre-processing function that produces the ControlNet input image, which is implemented with OpenCV.js. The ControlNet pipeline is similar to the Image-To-Image pipeline in that both require an image as input. The main difference is how that image is used: the Image-To-Image pipeline adds noise to the input image and uses the result as the input latent instead of random noise, whereas the ControlNet pipeline passes the input image, together with other arguments, to the ControlNet model, whose outputs (the down block samples and the middle block sample) are then used as inputs to the UNET model. The shapes of the ControlNet inputs were taken from here, and the shapes of the UNET inputs were inferred from the tensor shapes rather than from the input names.
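The data flow described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `Tensor` type, the `ControlNetFn` and `UnetFn` signatures, and `predictNoise` are hypothetical names chosen for the example, and the real models are ONNX sessions rather than plain functions.

```typescript
// Hypothetical types for illustration only; they are not the diffusers.js API.
type Tensor = { data: Float32Array; dims: number[] };

// Residuals that the ControlNet model hands over to the UNET.
interface ControlNetOutput {
  downBlockSamples: Tensor[]; // one residual per UNET down block
  midBlockSample: Tensor;     // residual for the UNET middle block
}

// Assumed call shapes: the control image is seen only by the ControlNet.
type ControlNetFn = (
  latents: Tensor,
  timestep: number,
  textEmbeds: Tensor,
  controlImage: Tensor,
) => ControlNetOutput;

type UnetFn = (
  latents: Tensor,
  timestep: number,
  textEmbeds: Tensor,
  residuals: ControlNetOutput,
) => Tensor;

// One denoising step: run the ControlNet first, then feed its
// down/middle block samples into the UNET as extra inputs.
function predictNoise(
  controlNet: ControlNetFn,
  unet: UnetFn,
  latents: Tensor,
  timestep: number,
  textEmbeds: Tensor,
  controlImage: Tensor,
): Tensor {
  const residuals = controlNet(latents, timestep, textEmbeds, controlImage);
  return unet(latents, timestep, textEmbeds, residuals);
}
```

Keeping the control image out of the UNET signature makes the difference from Image-To-Image explicit: the image influences the result only through the ControlNet residuals, while the latents can still start from random noise.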

In addition, the Image-To-Image feature was added to the StableDiffusionControlNet pipeline, which resulted in the new StableDiffusionControlNetImg2Img pipeline.
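As a rough illustration of what the Image-To-Image part contributes, the sketch below shows how a `strength` value can select which scheduler timesteps are actually run when denoising starts from a noised input image. It mirrors the timestep selection used by the Python diffusers Image-To-Image pipelines; whether this PR does exactly the same is an assumption, and `selectImg2ImgTimesteps` is a hypothetical helper name.

```typescript
// Hypothetical helper, modelled on the Python diffusers img2img logic.
// `timesteps` is the scheduler's full schedule, ordered from most to least noisy.
// `strength` in [0, 1] says how much of the schedule to run:
// 1 runs every step (input image fully noised), 0 runs none (image unchanged).
function selectImg2ImgTimesteps(
  timesteps: readonly number[],
  strength: number,
): number[] {
  if (strength < 0 || strength > 1) {
    throw new RangeError("strength must be between 0 and 1");
  }
  const numSteps = timesteps.length;
  // Number of denoising steps that will actually be executed.
  const initSteps = Math.min(Math.floor(numSteps * strength), numSteps);
  // Skip the noisiest steps; the input image is noised to the first kept timestep.
  const start = Math.max(numSteps - initSteps, 0);
  return timesteps.slice(start);
}
```

In a combined pipeline, the noised image latent would replace the random initial latent, while the ControlNet residuals are still computed at each of the remaining timesteps.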

Convert Command

I converted the model with the following command, using the converter that you use:

    python conv_sd_to_onnx.py --model_path "runwayml/stable-diffusion-v1-5" --output_path "./model/sd1-5_fp16_cn_canny" --controlnet_path "lllyasviel/control_v11p_sd15_canny" --fp16 --attention-slicing auto

Specific Changes

Pre-Processing Libraries

In this section I'll include all of the pre-processing libraries that I found, in case you want to use a different one or add another ControlNet model:

Issues / Future Work