I noticed that ControlNet is very sensitive to the conditioning strength: even at a relatively low strength it already preserves the shape of the condition.
The next figure shows the depth-map ControlNet for this scene, with the strength ranging from 0.1 to 1.0 and all other parameters kept the same (prompt: a man and woman sitting on a couch with party hats on).
By contrast, T2I-Adapter is more relaxed, and it needs a high strength before it follows the input condition.
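For anyone who wants to reproduce this kind of sweep, here is a minimal sketch. The helper name `sweep_conditioning_scale` and its parameters are my own and purely illustrative. It assumes `pipe` is an already loaded pipeline whose call accepts `prompt`, `image`, `generator` and a strength keyword: as far as I know that keyword is `controlnet_conditioning_scale` for StableDiffusionXLControlNetPipeline and `adapter_conditioning_scale` for StableDiffusionXLAdapterPipeline.

```python
from typing import Any, Callable, List, Optional


def sweep_conditioning_scale(
    pipe: Callable[..., Any],
    scale_kwarg: str,
    prompt: str,
    depth_image: Any,
    make_generator: Optional[Callable[[], Any]] = None,
) -> List[Any]:
    """Call `pipe` once per strength from 0.1 to 1.0, changing only the strength.

    `scale_kwarg` is the name of the strength argument, e.g.
    "controlnet_conditioning_scale" or "adapter_conditioning_scale".
    """
    scales = [round(0.1 * i, 1) for i in range(1, 11)]  # 0.1, 0.2, ..., 1.0
    outputs = []
    for scale in scales:
        kwargs = {"prompt": prompt, "image": depth_image, scale_kwarg: scale}
        if make_generator is not None:
            # Build a fresh generator for every call so each run starts from
            # the same noise and only the strength differs between images.
            kwargs["generator"] = make_generator()
        outputs.append(pipe(**kwargs))
    return outputs


# Hypothetical usage with a real pipeline (requires torch and a loaded `pipe`):
#   results = sweep_conditioning_scale(
#       pipe, "controlnet_conditioning_scale", prompt, depth_image,
#       make_generator=lambda: torch.Generator("cuda").manual_seed(0),
#   )
```

Passing a generator factory rather than a single generator matters here: a generator is stateful, so reusing one object across calls would change the starting noise from image to image.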
In some cases I have seen T2I-Adapter work better, and in others ControlNet. I'm pretty sure quite a few people can relate.
So I wanted to see what happens if I include both. I created a new StableDiffusionXLControlNetAdapterPipeline for this purpose and, because the naming convention for the UNet changed recently, both T2I-Adapter and ControlNet features can be inserted with minor changes.
The next figure shows some results combining both of them.
Left: controlnet strength 0.0 - adapter strength 1.0 (same as StableDiffusionXLAdapterPipeline)
Right: controlnet strength 1.0 - adapter strength 0.0 (same as StableDiffusionXLControlNetPipeline)
The images in between interpolate these values (0.1 - 0.9, 0.2 - 0.8, etc.). I only tried strengths that sum to 1.0, but the space of other combinations left to explore is huge. And all of that fits in a single file.
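The strength schedule used for that figure can be written down in a few lines. This is only a sketch of the pairing itself; `strength_pairs` is a hypothetical helper name, and how each pair is passed to the combined pipeline is left out on purpose.

```python
from typing import List, Tuple


def strength_pairs(steps: int = 11) -> List[Tuple[float, float]]:
    """Return (controlnet_strength, adapter_strength) pairs that sum to 1.0.

    The first pair is adapter only (0.0, 1.0) and the last one is
    ControlNet only (1.0, 0.0), matching the left and right images.
    """
    if steps < 2:
        raise ValueError("steps must be at least 2")
    pairs = []
    for i in range(steps):
        controlnet = round(i / (steps - 1), 3)
        adapter = round(1.0 - controlnet, 3)  # rounding hides float noise
        pairs.append((controlnet, adapter))
    return pairs


for controlnet_strength, adapter_strength in strength_pairs():
    print(f"controlnet {controlnet_strength:.1f} - adapter {adapter_strength:.1f}")
```

Dropping the constraint that the two values sum to 1.0 turns this into a full grid over both strengths, which is the larger exploration mentioned above.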
I created this issue to see whether this would be interesting for the core pipelines. I could contribute it as a PR.
For these images I used stabilityai/stable-diffusion-xl-base-1.0 as the base model, with diffusers/controlnet-depth-sdxl-1.0 and TencentARC/t2i-adapter-depth-midas-sdxl-1.0 as the ControlNet and T2I-Adapter, respectively.
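For reference, the two single-condition baselines (the left and right ends of the comparison) could be loaded with these checkpoints roughly as follows. This is a sketch with some assumptions of mine: fp16 weights on a CUDA device, and a hypothetical helper name `load_reference_pipelines`. The combined pipeline is not loaded here, since its constructor is not shown in this issue.

```python
BASE_MODEL = "stabilityai/stable-diffusion-xl-base-1.0"
CONTROLNET_MODEL = "diffusers/controlnet-depth-sdxl-1.0"
ADAPTER_MODEL = "TencentARC/t2i-adapter-depth-midas-sdxl-1.0"


def load_reference_pipelines(device: str = "cuda"):
    """Load the ControlNet-only and adapter-only SDXL pipelines."""
    # Imported lazily so the constants above are usable without diffusers.
    import torch
    from diffusers import (
        ControlNetModel,
        StableDiffusionXLAdapterPipeline,
        StableDiffusionXLControlNetPipeline,
        T2IAdapter,
    )

    dtype = torch.float16  # assumption: half precision on a GPU
    controlnet = ControlNetModel.from_pretrained(CONTROLNET_MODEL, torch_dtype=dtype)
    adapter = T2IAdapter.from_pretrained(ADAPTER_MODEL, torch_dtype=dtype)

    controlnet_pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
        BASE_MODEL, controlnet=controlnet, torch_dtype=dtype
    ).to(device)
    adapter_pipe = StableDiffusionXLAdapterPipeline.from_pretrained(
        BASE_MODEL, adapter=adapter, torch_dtype=dtype
    ).to(device)
    return controlnet_pipe, adapter_pipe
```

Both pipelines take the same depth map as `image`, so the comparison only differs in which conditioning model is attached.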