Open birchcode opened 1 month ago
This issue is being marked stale because it has not had any activity for 30 days. Reply below within 7 days if your issue still isn't solved, and it will be left open. Otherwise, the issue will be closed automatically.
Just pulled the latest to see if this has been fixed; the same issue persists. One new thing we tried, since we are using an RDNA2 AMD card, is this startup option:
HSA_OVERRIDE_GFX_VERSION=10.3.0 python main.py --force-fp32
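For readers unfamiliar with the flag: `--force-fp32` makes the whole pipeline run in float32 instead of half precision. A minimal sketch of that flag-to-dtype decision is below; the function name and the string return values are illustrative, not ComfyUI's actual internals.

```python
# Illustrative sketch (not ComfyUI's real code): map a --force-fp32-style
# flag to a compute dtype. Running in float32 is slower and uses more
# memory, but avoids ops that are "not implemented for 'Half'" on MPS.
import argparse

def pick_dtype(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--force-fp32", action="store_true",
        help="run everything in float32 instead of half precision",
    )
    args = parser.parse_args(argv)
    # The strings stand in for torch.float32 / torch.float16 here.
    return "float32" if args.force_fp32 else "float16"

print(pick_dtype(["--force-fp32"]))  # float32
print(pick_dtype([]))                # float16
```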
Again, we are able to generate an image, but the result of the default loaded setup is this:
Log output:
Total VRAM 114688 MB, total RAM 114688 MB
pytorch version: 2.2.2
Forcing FP32, if this improves things please report it.
Set vram state to: SHARED
Device: mps
Using sub quadratic optimization for cross attention, if you have memory or speed issues try using: --use-split-cross-attention
[Prompt Server] web root: /Users/rmp/projects/ComfyUI/web
Adding extra search path checkpoints /Users/rmp/stable-diffusion-webui/models/Stable-diffusion
Adding extra search path configs /Users/rmp/stable-diffusion-webui/models/Stable-diffusion
Adding extra search path vae /Users/rmp/stable-diffusion-webui/models/VAE
Adding extra search path loras /Users/rmp/stable-diffusion-webui/models/Lora
Adding extra search path loras /Users/rmp/stable-diffusion-webui/models/LyCORIS
Adding extra search path upscale_models /Users/rmp/stable-diffusion-webui/models/ESRGAN
Adding extra search path upscale_models /Users/rmp/stable-diffusion-webui/models/RealESRGAN
Adding extra search path upscale_models /Users/rmp/stable-diffusion-webui/models/SwinIR
Adding extra search path embeddings /Users/rmp/stable-diffusion-webui/embeddings
Adding extra search path hypernetworks /Users/rmp/stable-diffusion-webui/models/hypernetworks
Adding extra search path controlnet /Users/rmp/stable-diffusion-webui/models/ControlNet
Import times for custom nodes:
0.0 seconds: /Users/rmp/projects/ComfyUI/custom_nodes/websocket_image_save.py
Starting server
To see the GUI go to: http://127.0.0.1:8188
got prompt
model weight dtype torch.float32, manual cast: None
model_type EPS
Using split attention in VAE
Using split attention in VAE
/Users/rmp/.pyenv/versions/3.8.10/lib/python3.8/site-packages/transformers/tokenization_utils_base.py:1601: FutureWarning: `clean_up_tokenization_spaces` was not set. It will be set to `True` by default. This behavior will be depracted in transformers v4.45, and will be then set to `False` by default. For more details check this issue: https://github.com/huggingface/transformers/issues/31884
warnings.warn(
Requested to load SD1ClipModel
Loading 1 new model
loaded completely 0.0 235.84423828125 True
Requested to load BaseModel
Loading 1 new model
loaded completely 0.0 3278.812271118164 True
0%| | 0/20 [00:00<?, ?it/s]/Users/rmp/.pyenv/versions/3.8.10/lib/python3.8/site-packages/torch/nn/functional.py:4001: UserWarning: MPS: 'nearest' mode upsampling is supported natively starting from macOS 13.0. Falling back on CPU. This may have performance implications. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/mps/operations/UpSample.mm:255.)
return torch._C._nn.upsample_nearest2d(input, output_size, scale_factors)
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:14<00:00, 1.36it/s]
Requested to load AutoencoderKL
Loading 1 new model
loaded completely 0.0 319.11416244506836 True
/Users/rmp/.pyenv/versions/3.8.10/lib/python3.8/site-packages/torch/nn/functional.py:4001: UserWarning: MPS: passing scale factor to upsample ops is supported natively starting from macOS 13.0. Falling back on CPU. This may have performance implications. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/mps/operations/UpSample.mm:246.)
return torch._C._nn.upsample_nearest2d(input, output_size, scale_factors)
Prompt executed in 27.82 seconds
Your question
We're running on probably unsupported hardware: an AMD RX 6900 XT in a Mac 5,1 on macOS 12. Launching without any parameters fails with `RuntimeError: "upsample_nearest2d_channels_last" not implemented for 'Half'`, but we can work around it by starting with --force-fp32.
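One thing that may be worth trying alongside (or instead of) --force-fp32 is PyTorch's MPS CPU-fallback switch. This is a real PyTorch environment variable, but whether it resolves the Half-precision upsample error on macOS 12 is an assumption to verify, not a guarantee; the sketch below just shows how it has to be set before torch is imported.

```python
# Hedged sketch: PYTORCH_ENABLE_MPS_FALLBACK=1 asks PyTorch to run ops the
# MPS backend does not support on the CPU instead of raising an error.
# It is read at import time, so it must be set before "import torch".
import os

os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

# import torch  # import torch only after the variable is set
print(os.environ["PYTORCH_ENABLE_MPS_FALLBACK"])  # 1
```

Equivalently, it can be set on the command line: `PYTORCH_ENABLE_MPS_FALLBACK=1 python main.py`.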
The issue, however, is that renders come out as a glitched image.
While generating this we see in the logs:
/Users/rmp/projects/ComfyUI/venv/lib/python3.8/site-packages/torch/nn/functional.py:4001: UserWarning: MPS: 'nearest' mode upsampling is supported natively starting from macOS 13.0. Falling back on CPU. This may have performance implications. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/mps/operations/UpSample.mm:255.)
return torch._C._nn.upsample_nearest2d(input, output_size, scale_factors)
100%|
Requested to load AutoencoderKL
Loading 1 new model
/Users/rmp/projects/ComfyUI/venv/lib/python3.8/site-packages/torch/nn/functional.py:4001: UserWarning: MPS: passing scale factor to upsample ops is supported natively starting from macOS 13.0. Falling back on CPU. This may have performance implications. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/mps/operations/UpSample.mm:246.)
return torch._C._nn.upsample_nearest2d(input, output_size, scale_factors)
Prompt executed in 31.60 seconds
Is this warning what would produce the glitched image result, or should generation still work correctly with the fallback to CPU?
We have been using Automatic1111 in this environment without problems, utilising the GPU, and would love to move to Comfy. Any ideas or pointers are greatly appreciated.
Logs
No response
Other
No response