huggingface / diffusers

Multidiffusion (panorama pipeline) is missing segmentation inputs? #9802

Open jloveric opened 1 month ago

jloveric commented 1 month ago

I'm looking at the MultiDiffusion panorama pipeline page (https://huggingface.co/docs/diffusers/en/api/pipelines/panorama). There appears to be no way to specify a segmentation and its associated prompts, as in the original paper (https://multidiffusion.github.io/). If the code only supports panorama generation and not region-based generation from segmentation masks and prompts, it should be extended to include regional generation. If it does support region-based generation, the documentation should be updated to show how to use it!
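
For context, region-based generation as described in the paper runs one denoising pass per region prompt and then merges the results with a per-pixel, mask-weighted average at every step. Below is a minimal NumPy sketch of that merge step only. It is not the diffusers API: `fuse_region_latents` is a hypothetical helper name, and the toy arrays stand in for real denoised latents.

```python
import numpy as np


def fuse_region_latents(latents, masks):
    """Mask-weighted average of per-region latents (hypothetical helper).

    latents: shape (R, C, H, W), one latent per region prompt
    masks:   shape (R, H, W), non-negative weight of each region per pixel
    """
    latents = np.asarray(latents, dtype=np.float64)
    masks = np.asarray(masks, dtype=np.float64)
    weights = masks[:, None, :, :]      # (R, 1, H, W), broadcast over channels
    total = weights.sum(axis=0)         # (1, H, W), total weight per pixel
    if np.any(total <= 0):
        raise ValueError("every pixel must be covered by at least one mask")
    return (weights * latents).sum(axis=0) / total


# Toy data: a "foreground" prompt owns the right half of a 4x4 latent,
# a "background" prompt owns the rest.
channels, height, width = 2, 4, 4
background = np.zeros((channels, height, width))
foreground = np.ones((channels, height, width))

mask_fg = np.zeros((height, width))
mask_fg[:, 2:] = 1.0
mask_bg = 1.0 - mask_fg

fused = fuse_region_latents([background, foreground], [mask_bg, mask_fg])

# Where masks overlap, the latents are averaged instead.
overlap = fuse_region_latents(
    [background, foreground],
    [np.ones((height, width)), np.ones((height, width))],
)
```

A pipeline supporting this would need to accept one mask per prompt and apply a merge like this inside the denoising loop, which the panorama pipeline's documented inputs do not appear to expose.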

jloveric commented 1 month ago

It looks like this would need a change to the ControlNet pipeline (or a new pipeline): https://github.com/omerbt/MultiDiffusion/blob/master/pipeline_controlnet.py

sayakpaul commented 1 month ago

Thanks for bringing this up. When MultiDiffusion was added to the library, it likely didn't offer region-based control, so it wasn't added. We currently don't have the bandwidth to take this up, but if you want to contribute this feature, we're happy to guide you.

jloveric commented 1 month ago

It could be that the masked version doesn't work that well with region-based control masks. Here is one paper that compares a few techniques using very explicit masks (instead of regions): https://arxiv.org/pdf/2406.04032

github-actions[bot] commented 2 days ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.