energy-based-model / Compositional-Visual-Generation-with-Composable-Diffusion-Models-PyTorch

[ECCV 2022] Compositional Generation using Diffusion Models
https://energy-based-model.github.io/Compositional-Visual-Generation-with-Composable-Diffusion-Models/

Stable Diffusion uses standard pipeline, where is composition? #11

Closed: ghost closed this issue 2 years ago

ghost commented 2 years ago

https://huggingface.co/spaces/Shuang59/Composable-Diffusion/blob/main/app.py

With Stable Diffusion, there appears to be no composition happening. The code just uses the standard pipeline from diffusers:

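# Excerpt only: `st` and `device` are not defined in this snippet and come from elsewhere in app.py.
from diffusers import StableDiffusionPipeline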

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    use_auth_token=st.secrets["USER_TOKEN"]
).to(device)

Also, what happened to the NOT operator in the paper? It appears absent.

nanlliu commented 2 years ago

Here we pip install our modified diffusers package, which adds compositionality (see https://github.com/energy-based-model/Compositional-Visual-Generation-with-Composable-Diffusion-Models-PyTorch/blob/main/requirements.txt).

See the modification details at https://github.com/nanlliu/diffusers/blob/main/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py#L143-L146.

We split the prompt string at https://github.com/nanlliu/diffusers/blob/main/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py#L74
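For readers who do not want to open the fork, here is a minimal sketch of the idea, not the fork's actual code. It assumes the composite prompt is split on a "|" delimiter and that each sub-prompt's noise prediction is added to the unconditional prediction as a weighted offset. The names split_prompt and compose_and, the delimiter default, and the scalar stand-in values are all hypothetical.

```python
def split_prompt(prompt, delimiter="|"):
    """Split a composite prompt into its component prompts (delimiter is an assumption)."""
    return [part.strip() for part in prompt.split(delimiter) if part.strip()]


def compose_and(eps_uncond, eps_conds, weights):
    """Combine per-prompt noise predictions with an AND-style weighted sum.

    Works on anything that supports +, -, * (floats, NumPy arrays, torch tensors).
    """
    if len(eps_conds) != len(weights):
        raise ValueError("need exactly one weight per conditional prediction")
    composed = eps_uncond
    for eps_c, w in zip(eps_conds, weights):
        # Each concept pushes the prediction away from the unconditional one.
        composed = composed + w * (eps_c - eps_uncond)
    return composed


prompts = split_prompt("a stone castle | a misty forest")

# Scalar stand-ins for noise predictions; a real pipeline would use UNet output tensors.
eps_uncond = 0.5
eps_conds = [1.0, 0.25]
result = compose_and(eps_uncond, eps_conds, weights=[2.0, 2.0])
```

With a single sub-prompt this reduces to ordinary classifier-free guidance, which is why the app code can look like the standard pipeline call from the outside.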

Hope that helps.

For the NOT operator, you need to negate the score of the concept you condition on. For simplicity, we only support the AND operator for now.
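One way to read "negate the score" is sketched below. This is an illustration and not code from the repository: the conditional offset for the unwanted concept is subtracted instead of added. The function name compose_with_negation, its parameters, and the default weights are hypothetical.

```python
def compose_with_negation(eps_uncond, eps_pos, eps_neg, w_pos=7.5, w_neg=7.5):
    """Steer toward one concept and away from another.

    Works on anything that supports +, -, * (floats, NumPy arrays, torch tensors).
    """
    pos_dir = eps_pos - eps_uncond  # direction toward the wanted concept
    neg_dir = eps_neg - eps_uncond  # direction toward the unwanted concept
    # The negated concept enters with a minus sign.
    return eps_uncond + w_pos * pos_dir - w_neg * neg_dir


# Scalar stand-ins for noise predictions; a real pipeline would use UNet output tensors.
negated = compose_with_negation(0.0, eps_pos=1.0, eps_neg=0.5, w_pos=2.0, w_neg=2.0)
```

In practice the negative weight usually needs tuning, since pushing too hard away from a concept can degrade the image.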

ghost commented 2 years ago

I see now. It would have been better to just create your own pipeline on top of the original diffusers.