huggingface / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.

https://huggingface.co/docs/diffusers

Apache License 2.0

Generation using StableDiffusionPipeline with torch_dtype=torch.float16 and mps crashes the kernel on Mac M1 #4480

Closed jorgen-h closed 3 days ago

jorgen-h commented 1 year ago

Describe the bug

If I add torch_dtype=torch.float16 to any model, the Python kernel crashes when I try to generate images. It works well when I don't add that setting. I have a Mac M1, so I run with .to("mps").

Reproduction

Recreate bug code

from diffusers import StableDiffusionPipeline
import torch

MODEL_VERSION = "runwayml/stable-diffusion-v1-5"

pytorch_pipe = StableDiffusionPipeline.from_pretrained(
    MODEL_VERSION,
    torch_dtype=torch.float16,  # Remove this line and it works
).to("mps")

image = pytorch_pipe(
    prompt="photo of a man standing next to a wall",
    width=512,
    height=512,
    num_inference_steps=50,
    num_images_per_prompt=1,
    guidance_scale=7,
)

Logs

No response

System Info

diffusers version: 0.20.0.dev0 (installed diffusers latest dev version to see if it was working there, but issue is also present in latest official release)
Mac M1 Pro with 16Gb memory
Platform: macOS-13.5-arm64-arm-64bit
Python version: 3.11.4
PyTorch version (GPU?): 2.0.1 (False)
Huggingface_hub version: 0.16.4
Transformers version: 4.31.0
Accelerate version: 0.21.0
xFormers version: not installed
Using GPU in script?: using .to("mps")
Using distributed or parallel set-up in script?:

Who can help?

No response

sayakpaul commented 1 year ago

What is the error trace? Could this be because MPS doesn't have the implementation in float16?

Cc: @pcuenca

adi-lb-phoenix commented 1 year ago

https://huggingface.co/docs/diffusers/optimization/mps
https://github.com/pytorch/pytorch/issues/84039

@sayakpaul According to the docs and a few open issues, mps seems to be the problem.

@jorgen-h This has worked for me:

pytorch_pipe = StableDiffusionPipeline.from_pretrained(
    MODEL_VERSION,
    torch_dtype=torch.float32,  # Remove this line and it works
).to("cpu")

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

patrickvonplaten commented 1 year ago

Gently pinging @pcuenca again

jorgen-h commented 1 year ago

Using fp16 with mps seems to work now with the nightly build of PyTorch. I don't know from which version it started to work, though.

pcuenca commented 1 year ago

Thanks @jorgen-h, good to know! Did you test PyTorch 2.1.0 or just the nightly?

jorgen-h commented 1 year ago

It seems to work both with PyTorch 2.1.0 and nightly.

However, if you set torch_dtype=torch.float16 in your Pipeline and also do pipeline.enable_attention_slicing() then you get a pure black image as output.

Works well if you only apply one of the above settings.
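A minimal sketch of how the two observations above could be encoded as guards in a setup script. The helper names (parse_version, pick_mps_dtype, configure_attention_slicing) are hypothetical and not part of diffusers. The 2.1 threshold is taken only from the report in this thread ("seems to work"), not from any confirmed changelog. dtype names are plain strings here so the sketch runs without torch installed.

```python
def parse_version(version: str) -> tuple[int, int]:
    """Extract (major, minor) from strings such as '2.1.0' or '2.0.1+cu118'."""
    major, minor = version.split("+")[0].split(".")[:2]
    return int(major), int(minor)


def pick_mps_dtype(torch_version: str) -> str:
    """Pick a dtype name for the mps device.

    Assumption from this thread: fp16 on mps crashed with PyTorch 2.0.1
    and was reported to work with 2.1.0 and nightly builds.
    """
    if parse_version(torch_version) < (2, 1):
        return "float32"
    return "float16"


def configure_attention_slicing(pipeline, device: str, dtype: str) -> bool:
    """Enable attention slicing unless the combination is fp16 on mps.

    That combination was reported above to produce pure black images.
    Returns True if slicing was enabled.
    """
    if device == "mps" and dtype == "float16":
        return False
    pipeline.enable_attention_slicing()
    return True
```

With a real pipeline, the dtype name could be turned into a torch dtype via getattr(torch, pick_mps_dtype(torch.__version__)) before calling from_pretrained, and configure_attention_slicing(pipeline, "mps", dtype_name) would replace the unconditional enable_attention_slicing() call.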

SpiraMira commented 11 months ago

Hi @pcuenca, any news on this? Running a diffusion pipeline in fp16 still generates black images for me:

from diffusers import DiffusionPipeline
import torch

pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16, variant="fp16", use_safetensors=True).to("mps")
pipeline.enable_attention_slicing()

The fp16 pipelines only work when attention slicing is not enabled. Note: I haven't noticed a big difference in inference performance or memory pressure when it is enabled (without float16). While this is not critical, I'm striving to create a stable M1 Pro platform that runs most of Jeremy Howard's notebooks out of the box on Apple notebooks.
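When comparing configurations, it can help to detect the black-image failure programmatically instead of by eye. The sketch below assumes the pipeline returns PIL images (the default output type), whose getextrema() method gives per-band (min, max) values. The helper name is_all_black is hypothetical.

```python
def is_all_black(extrema) -> bool:
    """Return True if every band's maximum pixel value is 0.

    `extrema` is the value returned by PIL's Image.getextrema():
    a (min, max) pair for single-band images, or a tuple of such
    pairs (one per band) for multi-band images such as RGB.
    """
    # Normalise the single-band form to a one-element tuple of pairs.
    if isinstance(extrema[0], (int, float)):
        extrema = (extrema,)
    return all(band_max == 0 for _, band_max in extrema)
```

For example, is_all_black(pipeline(prompt).images[0].getextrema()) could be run once with and once without enable_attention_slicing() to confirm which setting triggers the problem.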

Any insight is deeply appreciated, Patrice.

System Info

Mac M1 Pro with 16Gb memory
Platform: macOS-14.1.2 (latest)
Running environment: Jupyter notebook and lab (latest, from scratch a week ago, with traitlets set to 5.0.9), latest conda (from scratch a week ago), latest fastai, etc.
diffusers version: 0.24.0
PyTorch version: 2.1.1
Python version: 3.11.6
Huggingface_hub version: 0.16.4
Transformers version: 4.33.2
Accelerate version: 0.25.0

Sanster commented 10 months ago

I can also reproduce this in my environment (Mac M2 Max, 64GB):

torch: 2.1.0
diffusers: 0.25.0
transformers: 4.36.2
accelerate: 0.24.1

It affects not only the runwayml/stable-diffusion-v1-5 model, but also the timbrooks/instruct-pix2pix and diffusers/stable-diffusion-xl-1.0-inpainting-0.1 models.

patrickvonplaten commented 10 months ago

@pcuenca if you have a couple minutes - this seems like a highly requested feature

pcuenca commented 10 months ago

I'll test again.

yiyixuxu commented 9 months ago

Gentle ping @pcuenca

sayakpaul commented 9 months ago

Gentle ping @pcuenca

bghira commented 8 months ago

I believe this may be addressed for SDXL pipelines in #7447, though I never directly ran into the issue during inference. I suppose it's possible. For me, it impacted training.

sayakpaul commented 1 week ago

Is this still a problem?