Open birchcode opened 1 month ago
This issue is being marked stale because it has not had any activity for 30 days. Reply below within 7 days if your issue still isn't solved, and it will be left open. Otherwise, the issue will be closed automatically.
Just pulled the latest to see if this has been fixed; the same issue persists. One new thing we tried, since we are using an RDNA2 AMD card, is this startup option:
HSA_OVERRIDE_GFX_VERSION=10.3.0 python main.py --force-fp32
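For readers unfamiliar with the flag: `--force-fp32` makes the whole pipeline run in float32 instead of half precision. A minimal sketch of that flag-to-dtype decision is below; the function name and the string return values are illustrative, not ComfyUI's actual internals.

```python
# Illustrative sketch (not ComfyUI's real code): map a --force-fp32-style
# flag to a compute dtype. Running in float32 is slower and uses more
# memory, but avoids ops that are "not implemented for 'Half'" on MPS.
import argparse

def pick_dtype(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--force-fp32", action="store_true",
        help="run everything in float32 instead of half precision",
    )
    args = parser.parse_args(argv)
    # The strings stand in for torch.float32 / torch.float16 here.
    return "float32" if args.force_fp32 else "float16"

print(pick_dtype(["--force-fp32"]))  # float32
print(pick_dtype([]))                # float16
```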
Again, we are able to generate an image, but the result of the default loaded setup is this:
Log output:
Total VRAM 114688 MB, total RAM 114688 MB
pytorch version: 2.2.2
Forcing FP32, if this improves things please report it.
Set vram state to: SHARED
Device: mps
Using sub quadratic optimization for cross attention, if you have memory or speed issues try using: --use-split-cross-attention
[Prompt Server] web root: /Users/rmp/projects/ComfyUI/web
Adding extra search path checkpoints /Users/rmp/stable-diffusion-webui/models/Stable-diffusion
Adding extra search path configs /Users/rmp/stable-diffusion-webui/models/Stable-diffusion
Adding extra search path vae /Users/rmp/stable-diffusion-webui/models/VAE
Adding extra search path loras /Users/rmp/stable-diffusion-webui/models/Lora
Adding extra search path loras /Users/rmp/stable-diffusion-webui/models/LyCORIS
Adding extra search path upscale_models /Users/rmp/stable-diffusion-webui/models/ESRGAN
Adding extra search path upscale_models /Users/rmp/stable-diffusion-webui/models/RealESRGAN
Adding extra search path upscale_models /Users/rmp/stable-diffusion-webui/models/SwinIR
Adding extra search path embeddings /Users/rmp/stable-diffusion-webui/embeddings
Adding extra search path hypernetworks /Users/rmp/stable-diffusion-webui/models/hypernetworks
Adding extra search path controlnet /Users/rmp/stable-diffusion-webui/models/ControlNet
Import times for custom nodes:
0.0 seconds: /Users/rmp/projects/ComfyUI/custom_nodes/websocket_image_save.py
Starting server
To see the GUI go to: http://127.0.0.1:8188
got prompt
model weight dtype torch.float32, manual cast: None
model_type EPS
Using split attention in VAE
Using split attention in VAE
/Users/rmp/.pyenv/versions/3.8.10/lib/python3.8/site-packages/transformers/tokenization_utils_base.py:1601: FutureWarning: `clean_up_tokenization_spaces` was not set. It will be set to `True` by default. This behavior will be depracted in transformers v4.45, and will be then set to `False` by default. For more details check this issue: https://github.com/huggingface/transformers/issues/31884
warnings.warn(
Requested to load SD1ClipModel
Loading 1 new model
loaded completely 0.0 235.84423828125 True
Requested to load BaseModel
Loading 1 new model
loaded completely 0.0 3278.812271118164 True
0%| | 0/20 [00:00<?, ?it/s]/Users/rmp/.pyenv/versions/3.8.10/lib/python3.8/site-packages/torch/nn/functional.py:4001: UserWarning: MPS: 'nearest' mode upsampling is supported natively starting from macOS 13.0. Falling back on CPU. This may have performance implications. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/mps/operations/UpSample.mm:255.)
return torch._C._nn.upsample_nearest2d(input, output_size, scale_factors)
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:14<00:00, 1.36it/s]
Requested to load AutoencoderKL
Loading 1 new model
loaded completely 0.0 319.11416244506836 True
/Users/rmp/.pyenv/versions/3.8.10/lib/python3.8/site-packages/torch/nn/functional.py:4001: UserWarning: MPS: passing scale factor to upsample ops is supported natively starting from macOS 13.0. Falling back on CPU. This may have performance implications. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/mps/operations/UpSample.mm:246.)
return torch._C._nn.upsample_nearest2d(input, output_size, scale_factors)
Prompt executed in 27.82 seconds
Your question
We're running on probably unsupported hardware: an AMD RX 6900 XT in a Mac 5,1 on macOS 12. Launching without any parameters fails with `RuntimeError: "upsample_nearest2d_channels_last" not implemented for 'Half'`, but we can work around it by starting with --force-fp32.
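One thing that may be worth trying alongside (or instead of) --force-fp32 is PyTorch's MPS CPU-fallback switch. This is a real PyTorch environment variable, but whether it resolves the Half-precision upsample error on macOS 12 is an assumption to verify, not a guarantee; the sketch below just shows how it has to be set before torch is imported.

```python
# Hedged sketch: PYTORCH_ENABLE_MPS_FALLBACK=1 asks PyTorch to run ops the
# MPS backend does not support on the CPU instead of raising an error.
# It is read at import time, so it must be set before "import torch".
import os

os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

# import torch  # import torch only after the variable is set
print(os.environ["PYTORCH_ENABLE_MPS_FALLBACK"])  # 1
```

Equivalently, it can be set on the command line: `PYTORCH_ENABLE_MPS_FALLBACK=1 python main.py`.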
The issue, however, is that renders come out as a glitched image.
While generating this we see in the logs:
/Users/rmp/projects/ComfyUI/venv/lib/python3.8/site-packages/torch/nn/functional.py:4001: UserWarning: MPS: 'nearest' mode upsampling is supported natively starting from macOS 13.0. Falling back on CPU. This may have performance implications. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/mps/operations/UpSample.mm:255.)
return torch._C._nn.upsample_nearest2d(input, output_size, scale_factors)
100%|
Requested to load AutoencoderKL
Loading 1 new model
/Users/rmp/projects/ComfyUI/venv/lib/python3.8/site-packages/torch/nn/functional.py:4001: UserWarning: MPS: passing scale factor to upsample ops is supported natively starting from macOS 13.0. Falling back on CPU. This may have performance implications. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/mps/operations/UpSample.mm:246.)
return torch._C._nn.upsample_nearest2d(input, output_size, scale_factors)
Prompt executed in 31.60 seconds
Is this warning what would produce the glitched image result, or should generation still work correctly with the fallback to CPU?
We have been using Automatic1111 in this environment without problems, utilising the GPU, and would love to move to Comfy. Any ideas or pointers are greatly appreciated.
Logs
No response
Other
No response