huggingface / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
https://huggingface.co/docs/diffusers
Apache License 2.0

[Kolors] MPS always returns a black image #8970

Closed: bqhuyy closed this issue 1 month ago

bqhuyy commented 1 month ago

Describe the bug

When running Kolors on MPS, the output is always a black image.

Reproduction

Run the following code. The output image is entirely black.

import torch
from diffusers import DPMSolverMultistepScheduler, KolorsPipeline

pipe = KolorsPipeline.from_pretrained("Kwai-Kolors/Kolors-diffusers", torch_dtype=torch.float16, variant="fp16")

# Move the pipeline to the first available accelerator.
if torch.cuda.is_available():
    pipe = pipe.to("cuda")
elif torch.backends.mps.is_available():
    pipe = pipe.to("mps")

pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config, use_karras_sigmas=True)

# Recommended if your computer has < 64 GB of RAM
if torch.backends.mps.is_available():
    pipe.enable_attention_slicing()

# Prompt in English: "A photo of a ladybug, macro, zoom, high quality, cinematic, holding a sign that reads '可图'"
prompt = '一张瓢虫的照片,微距,变焦,高质量,电影,拿着一个牌子,写着"可图"'

image = pipe(
    prompt=prompt,
    negative_prompt="",
    guidance_scale=6.5,
    num_inference_steps=25,
).images[0]
image.save("kolors.png")

Logs

No response

System Info

Who can help?

@asomoza @yiyixuxu

tolgacangoz commented 1 month ago

Have you tried, or could you try, the latest stable and nightly PyTorch versions?

bqhuyy commented 1 month ago

@tolgacangoz I'm using the latest release: PyTorch 2.4.0 and torchvision 0.19.0.

tolgacangoz commented 1 month ago

What happens with torch_dtype=torch.bfloat16 or torch_dtype=torch.float32? Does choosing torch_dtype=torch.float32 and/or removing pipe.enable_attention_slicing() exceed your memory?
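
One way to run this kind of comparison is to generate once per configuration and check whether the result is entirely black. The sketch below is a hypothetical helper, not part of diffusers. `is_all_black` relies only on PIL's `Image.getextrema()`, and `first_working_config` assumes a caller-supplied `generate` function that builds the pipeline for one configuration (for example a `torch_dtype`) and returns a PIL image.

```python
from typing import Any, Callable, Iterable, Optional


def is_all_black(image: Any) -> bool:
    """Return True if every pixel in every band of a PIL image is 0."""
    extrema = image.getextrema()
    # Single-band images give (min, max); multi-band images give one
    # (min, max) pair per band. Normalize to the multi-band shape.
    if isinstance(extrema[0], (int, float)):
        extrema = (extrema,)
    return all(band_max == 0 for _, band_max in extrema)


def first_working_config(
    generate: Callable[[Any], Any],
    configs: Iterable[Any],
) -> Optional[Any]:
    """Return the first config whose generated image is not all black.

    `generate` is a hypothetical, caller-supplied function that runs the
    pipeline for one config and returns a PIL image.
    """
    for config in configs:
        if not is_all_black(generate(config)):
            return config
    return None
```

For example, `first_working_config(run_kolors, [torch.float16, torch.bfloat16, torch.float32])`, where `run_kolors` is your own function that loads the pipeline with the given `torch_dtype` and returns `images[0]`.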

asomoza commented 1 month ago

Sadly, I can't test on a Mac right now. The pipeline is the same as SDXL, so the problem is probably caused by the text encoder, since it uses custom code.

You could try the original repo, but my guess is that you will hit the same problem. If so, that's not something we can fix right now; you could ask the original authors or wait until it's properly integrated into transformers.

bqhuyy commented 1 month ago

@tolgacangoz It seems that the problem comes from pipe.enable_attention_slicing(). It works after removing it.
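
Based on this finding, here is a minimal sketch of a setup step that leaves attention slicing off on MPS. `configure_memory_options` is a hypothetical helper; it only calls `enable_attention_slicing()` and `disable_attention_slicing()`, which diffusers pipelines expose. Whether slicing is safe with other models or devices is not established in this thread, and turning it off raises peak memory use.

```python
def configure_memory_options(pipe, device: str, low_memory: bool = False) -> bool:
    """Toggle attention slicing; return True if slicing ends up enabled.

    Assumption taken from this thread: with Kolors on MPS, attention
    slicing produced black images, so it is always disabled there.
    """
    if device == "mps" or not low_memory:
        pipe.disable_attention_slicing()
        return False
    pipe.enable_attention_slicing()
    return True
```

Call it right after `pipe.to(device)`, for example `configure_memory_options(pipe, "mps", low_memory=True)`, which keeps slicing off despite the low-memory hint.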