comfyanonymous / ComfyUI

The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.
https://www.comfy.org/
GNU General Public License v3.0

Slow Inference with Flux Model on ComfyUI using PyTorch 2.4.0 #4447

Open · imxiaohan opened 2 months ago

imxiaohan commented 2 months ago

Your question

Environment

Issue Description

I've recently installed the latest version of ComfyUI following the official website instructions. I'm experiencing significant slowdowns when using Flux models with PyTorch 2.4.0, but the performance improves dramatically when switching to PyTorch 2.3.1.

Observed Behavior

Expected Behavior

I would expect the Flux models to run at similar speeds across different PyTorch versions, or at least not have such a significant performance difference.

Additional Information

Questions

  1. Is this a known issue with Flux models on ComfyUI when using PyTorch 2.4.0?
  2. What could be causing such a significant performance difference between PyTorch 2.4.0 and 2.3.1 for Flux models?
  3. Are there any workarounds or optimizations to improve Flux model performance with PyTorch 2.4.0?
  4. Is this likely to be resolved in future updates, or should I continue using PyTorch 2.3.1 for optimal performance?

Any insights into this performance discrepancy or suggestions for using Flux models with the latest PyTorch version would be greatly appreciated. Thank you for your time and assistance.

Logs

No response

Other

No response

comfyanonymous commented 2 months ago

This is a Windows-specific issue, and it's the reason the main Windows package ships with PyTorch 2.3.1: there are a few issues with PyTorch 2.4 on Windows that kill performance.

If you want the latest PyTorch, you can try the PyTorch nightly builds; the issue might be fixed there.
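For reference, the two options above map to the officially documented pip commands roughly as follows (a sketch assuming a CUDA 12.1 build; swap the index URL for your CUDA version, and run inside ComfyUI's Python environment):

```shell
# Pin the known-good version the Windows package ships with:
pip install torch==2.3.1 --index-url https://download.pytorch.org/whl/cu121

# Or try a nightly build, where the Windows regression may already be fixed:
pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/cu121
```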

ltdrdata commented 2 months ago

Check if shared memory is enabled in the NVIDIA Control Panel, and if it is, please turn it off.

Foul-Tarnished commented 2 months ago

WebUI-Forge doesn't get that slowdown with PyTorch 2.4. Either it's using nightly, or it's another issue.

I think your GPU is running out of VRAM and swapping into system memory. Disable "CUDA Sysmem Fallback" in the NVIDIA Control Panel.

(By the way, the RTX 4080 has 16 GB of VRAM, not 12 GB.)
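One way to sanity-check the out-of-VRAM theory is to compare what PyTorch has reserved against the card's total memory while a workflow runs (a minimal sketch; the 0.9 warning threshold is an arbitrary assumption, and this only sees PyTorch's own allocations, not other processes on the GPU):

```python
# Report how much of the GPU's dedicated memory PyTorch has reserved.
# If this is near 100% during sampling, the driver may be spilling into
# shared system memory, which is exactly the slowdown described above.
try:
    import torch
except ImportError:  # allow the report to degrade gracefully without torch
    torch = None


def vram_report(device: int = 0, warn_fraction: float = 0.9) -> str:
    if torch is None or not torch.cuda.is_available():
        return "CUDA not available"
    total = torch.cuda.get_device_properties(device).total_memory
    reserved = torch.cuda.memory_reserved(device)
    frac = reserved / total
    msg = f"reserved {reserved / 2**30:.1f} GiB of {total / 2**30:.1f} GiB ({frac:.0%})"
    if frac > warn_fraction:
        msg += " -- close to the limit, sysmem fallback likely"
    return msg


if __name__ == "__main__":
    print(vram_report())
```

Running this mid-generation (e.g. from a debugger or a custom node) gives a rough idea of whether the model actually fits in dedicated VRAM.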

LiJT commented 2 months ago

Check if shared memory is enabled in the NVIDIA Control Panel, and if it is, please turn it off.

Hello, I disabled GPU memory fallback, but I still have the same slow loading problem. The error is as follows:

"E:\ComfyUI-aki-v1.3\comfy\ldm\modules\attention.py:407: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at C:\actions-runner_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:555.) out = torch.nn.functional.scaled_dot_product_attention(q, k, v, attn_mask=mask, dropout_p=0.0, is_causal=False)"

More details here: https://github.com/comfyanonymous/ComfyUI/issues/4663#issuecomment-2316694952
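The warning above means `scaled_dot_product_attention` is falling back to a slower kernel because that PyTorch build was compiled without flash attention. A quick way to see which SDPA backends an install exposes (a sketch assuming PyTorch ≥ 2.0; note these flags report whether a backend is *enabled*, not whether your build actually compiled it, so the warning at runtime remains the authoritative signal):

```python
# Print which scaled_dot_product_attention backends are enabled in this
# PyTorch install. A Windows build without flash attention will still
# fall back to the (slower) math / memory-efficient kernels.
try:
    import torch
except ImportError:  # allow the check to degrade gracefully without torch
    torch = None


def sdpa_backends() -> dict:
    if torch is None:
        return {"error": "torch not installed"}
    cuda = torch.backends.cuda
    return {
        "torch": torch.__version__,
        "flash_sdp_enabled": cuda.flash_sdp_enabled(),
        "mem_efficient_sdp_enabled": cuda.mem_efficient_sdp_enabled(),
        "math_sdp_enabled": cuda.math_sdp_enabled(),
    }


if __name__ == "__main__":
    for key, value in sdpa_backends().items():
        print(f"{key}: {value}")
```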