huggingface / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
https://huggingface.co/docs/diffusers
Apache License 2.0

Multidiffusion (panorama pipeline) is missing segmentation inputs? #9802

Open jloveric opened 6 days ago

jloveric commented 6 days ago

I'm looking at the MultiDiffusion panorama pipeline page (https://huggingface.co/docs/diffusers/en/api/pipelines/panorama). It looks like there is no way to specify the segmentation masks and their associated prompts, as in the original paper (https://multidiffusion.github.io/). If the code only has the panorama capability and not region-based generation using segmentation and prompts, then it should be extended to include regional generation. If it does have region-based generation, then the documentation should be updated to show how to use it!
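
For reference, here is a minimal sketch of the region-based fusion step as I understand it from the paper: at each denoising step, every region prompt produces its own prediction, and the predictions are blended per pixel by a mask-weighted average. This is not the diffusers API or the authors' code. The function name `fuse_region_predictions` is hypothetical, and plain Python lists stand in for the latent tensors a real pipeline would use.

```python
def fuse_region_predictions(predictions, masks, eps=1e-8):
    """Blend per-region predictions with a per-pixel mask-weighted average.

    predictions: one flat list of floats per region prompt (hypothetical
                 stand-in for the denoised latents of each region).
    masks:       one flat list of weights in [0, 1] per region, same length
                 as the matching prediction.
    """
    if len(predictions) != len(masks):
        raise ValueError("need exactly one mask per prediction")
    num_pixels = len(predictions[0])
    for pred, mask in zip(predictions, masks):
        if len(pred) != num_pixels or len(mask) != num_pixels:
            raise ValueError("all predictions and masks must share one length")

    fused = []
    for p in range(num_pixels):
        # Total weight claimed by all regions at this pixel.
        weight_sum = sum(mask[p] for mask in masks)
        # Weighted sum of the region predictions at this pixel.
        value_sum = sum(mask[p] * pred[p] for mask, pred in zip(masks, predictions))
        # eps guards against division by zero where no region covers the pixel.
        fused.append(value_sum / max(weight_sum, eps))
    return fused


# Two regions over four pixels; both regions cover pixel 2.
background = [1.0, 1.0, 1.0, 1.0]
foreground = [5.0, 5.0, 5.0, 5.0]
masks = [[1, 1, 1, 0], [0, 0, 1, 1]]
fused = fuse_region_predictions([background, foreground], masks)
```

Where only one mask is active the fused value is that region's prediction, and where masks overlap the predictions are averaged. A pipeline supporting this would need to accept one mask and one prompt per region as inputs, which is what the panorama pipeline docs don't seem to expose.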

jloveric commented 6 days ago

It looks like this would need a change to the ControlNet pipeline (or a new pipeline): https://github.com/omerbt/MultiDiffusion/blob/master/pipeline_controlnet.py

sayakpaul commented 3 days ago

Thanks for bringing this up. When MultiDiffusion was added to the library, it likely didn't offer region-based control, so it wasn't added. Currently, we don't have the bandwidth to take this up. But if you want to contribute this feature, we're happy to guide you.

jloveric commented 3 days ago

It could be that the masked version doesn't work that well with region-based control masks. Here is one paper that compares a few techniques using very explicit masks (instead of regions): https://arxiv.org/pdf/2406.04032