huggingface / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
https://huggingface.co/docs/diffusers
Apache License 2.0

SDXL unconditional generation is broken #9680

Closed morrisalp closed 3 days ago

morrisalp commented 2 weeks ago

Describe the bug

Running SDXL generation with CFG scale 1.0 and with CFG scale 0.0 gives exactly the same results, but CFG scale 0.0 should perform unconditional generation.

Reproduction

from diffusers import StableDiffusionXLPipeline, StableDiffusionXLImg2ImgPipeline
import torch

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
    variant="fp16",
    use_safetensors=True,
).to("cuda")

prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"

img1 = pipe(prompt=prompt, guidance_scale=1.0, generator=torch.Generator().manual_seed(0)).images[0]
img2 = pipe(prompt=prompt, guidance_scale=0.0, generator=torch.Generator().manual_seed(0)).images[0]

Logs

No response

System Info

Copy-and-paste the text below in your GitHub issue and FILL OUT the two last points.

Who can help?

@yiyixuxu @sayakpaul @DN6

morrisalp commented 2 weeks ago

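I think this is because the property StableDiffusionXLPipeline.do_classifier_free_guidance uses the check self._guidance_scale > 1, so classifier-free guidance is skipped entirely for any scale of 1 or below and only the conditional prediction is used. CFG scales below 1 (including 0.0 for unconditional generation) are useful for some applications, though.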
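To make the reported behaviour concrete, here is a minimal sketch of the standard classifier-free guidance combination next to a `scale > 1` gate like the one described above. It uses plain Python lists instead of tensors, and `cfg_combine` and `gated_prediction` are hypothetical names for illustration, not diffusers functions.

```python
def cfg_combine(uncond, cond, scale):
    """Standard CFG mix: uncond + scale * (cond - uncond), element-wise."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


def gated_prediction(uncond, cond, scale):
    """Toy model of a pipeline that only applies CFG when scale > 1.

    Assumption: this mirrors the `self._guidance_scale > 1` check quoted
    in the comment above; it is not the actual pipeline code.
    """
    do_cfg = scale > 1
    if not do_cfg:
        # Guidance is skipped, so only the conditional prediction is used.
        return list(cond)
    return cfg_combine(uncond, cond, scale)


# Stand-ins for the unconditional and conditional noise predictions.
uncond = [0.0, 2.0, -1.0]
cond = [1.0, 4.0, 3.0]

# With the plain formula, scale 0.0 recovers the unconditional prediction
# and scale 1.0 recovers the conditional one.
formula_at_0 = cfg_combine(uncond, cond, 0.0)
formula_at_1 = cfg_combine(uncond, cond, 1.0)

# With the gate, scales 0.0 and 1.0 both return the conditional prediction.
gated_at_0 = gated_prediction(uncond, cond, 0.0)
gated_at_1 = gated_prediction(uncond, cond, 1.0)
```

Under these assumptions the gate makes every scale at or below 1 behave like scale 1, which matches the identical images in the reproduction.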

a-r-r-o-w commented 2 weeks ago

This is indeed the case. The two modes you've listed are conditional guided generation (guidance_scale > 1.0) and conditional non-guided generation (guidance_scale <= 1.0). For unconditional generation, you can pass an empty prompt and set guidance_scale <= 1.0, or pass torch.zeros(...) of the right shape as prompt_embeds. (I think the latter is specific to certain models, and torch.zeros may not always produce coherent results, so an empty prompt is the better choice.)
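A minimal sketch of the empty-prompt workaround suggested above. `generate_unconditional` and `run_demo` are hypothetical helper names; the only pipeline arguments used are `prompt`, `guidance_scale`, and `generator`, as in the reproduction. Treat it as an illustration under those assumptions rather than an official recipe.

```python
def generate_unconditional(pipe, generator=None, **call_kwargs):
    """Call a text-to-image pipeline with an empty prompt and CFG disabled.

    `pipe` is assumed to be callable like StableDiffusionXLPipeline and to
    return an object with an `.images` list.
    """
    scale = call_kwargs.setdefault("guidance_scale", 1.0)
    if scale > 1.0:
        # Above 1.0 the pipeline would apply guidance again.
        raise ValueError("guidance_scale must be <= 1.0 for this workaround")
    result = pipe(prompt="", generator=generator, **call_kwargs)
    return result.images[0]


def run_demo():
    """Example usage; needs diffusers, torch, a CUDA GPU, and the SDXL weights."""
    import torch
    from diffusers import StableDiffusionXLPipeline

    pipe = StableDiffusionXLPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0",
        torch_dtype=torch.float16,
        variant="fp16",
        use_safetensors=True,
    ).to("cuda")
    image = generate_unconditional(pipe, generator=torch.Generator().manual_seed(0))
    image.save("unconditional.png")  # hypothetical output path
```

`run_demo()` is not called automatically because it downloads the model. The helper only wraps the call, so whether the empty-prompt output counts as truly unconditional still depends on the model, as noted above.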

a-r-r-o-w commented 3 days ago

Marking this as closed due to the explanation above and inactivity. Feel free to re-open, though, if there's anything else we can help with.