Hey @anotherjesse,
Thanks for the issue! Could you check the model cards here: https://huggingface.co/lllyasviel/sd-controlnet-seg#released-checkpoints? For each ControlNet there should be a working example. If something doesn't work, I'm happy to help :-)
@patrickvonplaten thanks for the heads-up - I didn't notice that there were samples towards the bottom of each model card.
My question is: if this repository is meant to process/prep inputs for ControlNet, could or should Midas/Canny/the other detectors return images you can use in the pipelines directly?
In the code snippet for https://huggingface.co/lllyasviel/sd-controlnet-depth:
import numpy as np
import torch
from PIL import Image
from transformers import pipeline
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline, UniPCMultistepScheduler
from diffusers.utils import load_image
depth_estimator = pipeline('depth-estimation')
image = load_image("https://huggingface.co/lllyasviel/sd-controlnet-depth/resolve/main/images/stormtrooper.png")
image = depth_estimator(image)['depth']
image = np.array(image)
image = image[:, :, None]
image = np.concatenate([image, image, image], axis=2)
image = Image.fromarray(image)
controlnet = ControlNetModel.from_pretrained(
"fusing/stable-diffusion-v1-5-controlnet-depth", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
"runwayml/stable-diffusion-v1-5", controlnet=controlnet, safety_checker=None, torch_dtype=torch.float16
)
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
# Remove if you do not have xformers installed
# see https://huggingface.co/docs/diffusers/v0.13.0/en/optimization/xformers#installing-xformers
# for installation instructions
pipe.enable_xformers_memory_efficient_attention()
pipe.enable_model_cpu_offload()
image = pipe("Stormtrooper's lecture", image, num_inference_steps=20).images[0]
image.save('./images/stormtrooper_depth_out.png')
It seems like controlnet_aux's Midas module should return a depth image that is ready for the ControlNet pipeline, instead of requiring the user to add these numpy lines between the controlnet_aux call and the StableDiffusionControlNetPipeline call:
image = np.array(depth_image)
image = image[:, :, None]
image = np.concatenate([image, image, image], axis=2)
depth_image = Image.fromarray(image)
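For context, here is roughly how that looks end to end today, reusing the pipe from the snippet above. This is an untested sketch; the MidasDetector weights repo and the assumption that it returns uint8 numpy arrays are mine:
from controlnet_aux import MidasDetector
midas = MidasDetector.from_pretrained("lllyasviel/ControlNet")  # assumed weights repo
image = load_image("control.png")
depth_image, normal_image = midas(image)                        # assumed: uint8 numpy arrays
depth_image = depth_image[:, :, None]                           # HxW -> HxWx1
depth_image = np.concatenate([depth_image, depth_image, depth_image], axis=2)
depth_image = Image.fromarray(depth_image)                      # back to PIL for the pipeline
output = pipe(prompt, depth_image, num_inference_steps=20).images[0]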
Hmm, looking at more of the repository, perhaps that is what controlnet_aux.util.HWC3 is there for?
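For reference, the HWC3 helper in the original ControlNet annotator code does roughly the following (reproduced from memory as a sketch of the behavior, not the exact source):
def HWC3(x):
    # x: uint8 numpy array, HxW or HxWxC; returns an HxWx3 uint8 array
    assert x.dtype == np.uint8
    if x.ndim == 2:
        x = x[:, :, None]
    H, W, C = x.shape
    if C == 3:
        return x
    if C == 1:
        return np.concatenate([x, x, x], axis=2)
    if C == 4:
        # composite RGBA onto a white background
        color = x[:, :, 0:3].astype(np.float32)
        alpha = x[:, :, 3:4].astype(np.float32) / 255.0
        y = color * alpha + 255.0 * (1.0 - alpha)
        return y.clip(0, 255).astype(np.uint8)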
The way to use controlnet_aux + depth would be:
image = load_image("control.png")
depth_image, normal_image = midas(image)
image = HWC3(depth_image)
output = pipe(prompt, image, ...)
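One caveat (an assumption on my part, since I haven't checked how the pipeline prepares numpy inputs): HWC3 returns a numpy array, so depending on the diffusers version the last step may still need a conversion back to PIL, e.g.:
image = HWC3(depth_image)        # HxW uint8 -> HxWx3 uint8 numpy array
image = Image.fromarray(image)   # the pipeline may still expect a PIL image here
output = pipe(prompt, image, num_inference_steps=20).images[0]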
If so, I'll make a PR to add a note to the README :)
I'm coming at this repository as a way to use StableDiffusionControlNetPipelines.
In that context it seems like the detectors' output should work in the pipelines without further bit fiddling.
This might be the wrong perspective - and there are good reasons to return it the way it is...
When I try to use the Midas depth image:
I get an error:
Perhaps I'm doing something wrong?
To work around it, I've copied this snippet from the original ControlNet:
I have to do the same thing for CannyDetector.
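For completeness, the CannyDetector case looks roughly the same. Again a sketch, reusing the pipe from above; I'm assuming CannyDetector takes low/high thresholds and returns a single-channel uint8 numpy array:
from controlnet_aux import CannyDetector
canny = CannyDetector()
image = load_image("control.png")
canny_image = canny(np.array(image), 100, 200)   # assumed: returns HxW uint8 edge map
canny_image = canny_image[:, :, None]            # HxW -> HxWx1
canny_image = np.concatenate([canny_image, canny_image, canny_image], axis=2)
canny_image = Image.fromarray(canny_image)       # back to PIL for the pipeline
output = pipe(prompt, canny_image, num_inference_steps=20).images[0]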