comfyanonymous / ComfyUI

The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.
https://www.comfy.org/
GNU General Public License v3.0

Flux is very slow, and sometimes crashes in Comfy #4461

Open Michlozz opened 4 weeks ago

Michlozz commented 4 weeks ago

Your question

I got:

Total VRAM 8188 MB, total RAM 16011 MB
pytorch version: 2.3.1+cu121
Set vram state to: NORMAL_VRAM
Device: cuda:0 NVIDIA GeForce RTX 4060 Laptop GPU : cudaMallocAsync
Using pytorch cross attention
[Prompt Server] web root: C:\Users\ursus\Documents\stable-diffusion-webui\ComfyUI_windows_portable2\ComfyUI\web

Loading: ComfyUI-Manager (V2.50.1)

ComfyUI Revision: 2563 [2622c55a] | Released on '2024-08-18'

With schnell (4 steps) it sometimes takes more than 2 minutes per generation. With dev (20 steps) it can take more than 400 seconds per image. If I try more than one image per run, it sometimes crashes.

What slows it down so much? My computer isn't that bad.

Logs

No response

Other

No response

DenkingOfficial commented 4 weeks ago

What resolution are you using?

Also, which model exactly are you using: fp16, fp8, nf4, or a GGUF variant?

Michlozz commented 4 weeks ago

What resolution are you using?

Also, which model exactly are you using: fp16, fp8, nf4, or a GGUF variant?

fp8, at a resolution of 1024x1024

DenkingOfficial commented 4 weeks ago

fp8, at a resolution of 1024x1024

That's normal speed with 8 GB of VRAM and the fp8 model; try the GGUF quantized models instead.

For me (RTX 2060 Super, 8 GB VRAM) it takes 39 seconds for a 1920x1080 image with the schnell Q4_0 model (4 steps).
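For anyone following along, a minimal sketch of what switching to GGUF involves. This assumes the city96/ComfyUI-GGUF custom node and the community GGUF conversions of Flux schnell on Hugging Face; the exact repository and file names are assumptions, so check the project pages before running:

```shell
# Install the GGUF loader custom node (assumed repo; verify on GitHub).
cd ComfyUI/custom_nodes
git clone https://github.com/city96/ComfyUI-GGUF
pip install --upgrade gguf

# Download a quantized schnell checkpoint into the UNet models folder
# (assumed Hugging Face repo/file names; Q4_0 trades some quality for a
# much smaller memory footprint than fp8).
cd ../models/unet
wget https://huggingface.co/city96/FLUX.1-schnell-gguf/resolve/main/flux1-schnell-Q4_0.gguf
```

After restarting ComfyUI, swap the regular diffusion-model loader node in the workflow for the "Unet Loader (GGUF)" node the custom node pack adds, and point it at the downloaded file.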