What is the error trace? Could this be because MPS doesn't have the implementation in float16?
Cc: @pcuenca
https://huggingface.co/docs/diffusers/optimization/mps
https://github.com/pytorch/pytorch/issues/84039
@sayakpaul according to the docs and a few open issues, MPS seems to be the culprit. @jorgen-h, this has worked:

```python
pytorch_pipe = StableDiffusionPipeline.from_pretrained(
    MODEL_VERSION,
    torch_dtype=torch.float32,  # float32 instead of float16
).to("cpu")
```
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Gently pinging @pcuenca again
This (using fp16 with mps) seems to work now with the nightly build of PyTorch. I don't know from which version it started to work, though.
Thanks @jorgen-h, good to know! Did you test PyTorch 2.1.0 or just the nightly?
It seems to work both with PyTorch 2.1.0 and nightly.
However, if you set torch_dtype=torch.float16 in your pipeline and also call pipeline.enable_attention_slicing(), you get a pure black image as output.
It works well if you apply only one of the two settings.
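Putting the two observations above together, here is a minimal sketch (my own, not from the thread) of how one might guard the dtype and attention-slicing settings; the 2.1.0 cutoff and the fp16/slicing exclusion are taken from the reports in this thread:

```python
import torch
from packaging import version
from diffusers import DiffusionPipeline

# Sketch based on the reports above (not an official recipe): fp16 on mps
# reportedly works from PyTorch 2.1.0 onward, but combining fp16 with
# attention slicing still yields black images.
device = "mps" if torch.backends.mps.is_available() else "cpu"
new_enough = version.parse(torch.__version__.split("+")[0]) >= version.parse("2.1.0")
use_fp16 = device == "mps" and new_enough

pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16 if use_fp16 else torch.float32,
).to(device)

# Per the observation above, avoid attention slicing together with fp16 on mps.
if not use_fp16:
    pipe.enable_attention_slicing()
```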
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Hi @pcuenca - any news on this? Running a diffusion pipeline in fp16 still generates black images for me:
```python
from diffusers import DiffusionPipeline
import torch

pipeline = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
    variant="fp16",
    use_safetensors=True,
).to("mps")
pipeline.enable_attention_slicing()
```
The fp16 pipelines only work when attention slicing is disabled (or simply not enabled). Note: I haven't noticed a big difference in inference performance or memory pressure when it is enabled (without float16). While not critical, I'm striving to create a stable M1 Pro platform that runs most of Jeremy Howard's notebooks out of the box on Apple notebooks.
Any insight is deeply appreciated, Patrice.
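To check whether the reported all-black output actually occurs, a small sketch (mine, not from the thread; it assumes the standard PIL image output of the pipeline configured above):

```python
import numpy as np

# Sketch: run the fp16 + attention-slicing configuration from above and
# test whether the result is the pure black image described in this thread.
image = pipeline("photo of a man standing next to a wall").images[0]
if np.asarray(image).max() == 0:
    print("Reproduced: output image is completely black")
else:
    print("Output looks non-black")
```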
System Info
Mac M1 Pro with 16 GB memory
Platform: macOS-14.1.2 (latest)
Running environment: Jupyter Notebook and Lab (latest, set up from scratch a week ago, with traitlets pinned to 5.0.9)
Latest conda (from scratch a week ago), latest fastai, etc.
diffusers version: 0.24.0
PyTorch version: 2.1.1
Python version: 3.11.6
Huggingface_hub version: 0.16.4
Transformers version: 4.33.2
Accelerate version: 0.25.0
Can also be reproduced in my environment: Mac M2 Max with 64 GB.
Not only the runwayml/stable-diffusion-v1-5 model, but also the timbrooks/instruct-pix2pix and diffusers/stable-diffusion-xl-1.0-inpainting-0.1 models...
@pcuenca if you have a couple minutes - this seems like a highly requested feature
I'll test again.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Gentle ping @pcuenca
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Gentle ping @pcuenca
I believe this may be addressed for SDXL pipelines in #7447. Though I never directly ran into the issue during inference, I suppose it's possible; for me, it impacted training.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Is this still a problem?
Describe the bug
If I add torch_dtype=torch.float16 to any model, the Python kernel stops/crashes when trying to generate images; it works well when I don't add that setting. I have a Mac M1, so I run with .to("mps").
Reproduction
Code to reproduce the bug:
```python
from diffusers import StableDiffusionPipeline
import torch

MODEL_VERSION = "runwayml/stable-diffusion-v1-5"

pytorch_pipe = StableDiffusionPipeline.from_pretrained(
    MODEL_VERSION,
    torch_dtype=torch.float16,  # Remove this line and it works
).to("mps")

image = pytorch_pipe(
    prompt="photo of a man standing next to a wall",
    width=512,
    height=512,
    num_inference_steps=50,
    num_images_per_prompt=1,
    guidance_scale=7,
)
```
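One side note (mine, not from the report): the pipeline call returns a StableDiffusionPipelineOutput, so the generated image itself lives under .images:

```python
# Assuming the call above completes, the PIL image is in .images:
image.images[0].save("output.png")
```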
Logs
No response
System Info
diffusers version: 0.20.0.dev0 (installed the latest dev version of diffusers to see if it was working there, but the issue is also present in the latest official release)
Mac M1 Pro with 16 GB memory
Platform: macOS-13.5-arm64-arm-64bit
Python version: 3.11.4
PyTorch version (GPU?): 2.0.1 (False)
Huggingface_hub version: 0.16.4
Transformers version: 4.31.0
Accelerate version: 0.21.0
xFormers version: not installed
Using GPU in script?: using .to("mps")
Using distributed or parallel set-up in script?:
Who can help?
No response