Open aesxsc opened 2 months ago
Now Flux Dev doesn't work either. They both crash ComfyUI.
They both crash for me. I'm also using the venv from Stable Diffusion WebUI...
I've created a new venv and installed everything from scratch and got the same error.
have you tried downgrading torch?
What's the recommended version?
They both crash for me. I'm also using the venv from Stable Diffusion WebUI...
We both have 4070 Ti SUPERs, maybe that's the problem? I also tried Arch Linux. It crashes the whole DE.
Possibly not enough RAM/swap/pagefile to load the model at the given precision? Usually when a process is "Killed" in Linux, it's to prevent an out of memory situation that would lock the system up. I'd suggest looking into creating or enlarging a swap file. https://wiki.archlinux.org/title/Swap#Swap_file_creation
Try 8GB and see if that's enough, then go up by 4GB until the python process isn't killed on loading.
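For reference, the swap-file steps from the Arch wiki page linked above boil down to something like the following sketch (run as root; assumes an ext4 filesystem, btrfs needs extra steps covered on that page):

```shell
# Create and enable a 16 GiB swap file (size is an example; grow in 4 GiB
# steps as suggested above until the python process is no longer killed).
fallocate -l 16G /swapfile
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile
# Persist across reboots:
echo '/swapfile none swap defaults 0 0' >> /etc/fstab
```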
It used to work just fine a few days ago. Also, I have a 32GB swapfile + 32GB RAM, so I don't think that's the case. On Windows I have the pagefile set to Auto, which I don't think matters either.
Then I can only suggest running git reflog and going back through commits until it works again. It should be fairly easy to determine at which commit the issues started.
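If you'd rather bisect than walk back through commits one by one, git can halve the search space each round. A sketch of the workflow (the HEAD~50 starting point is a placeholder for whatever commit you know was good):

```shell
cd ComfyUI
git bisect start
git bisect bad HEAD        # current commit crashes
git bisect good HEAD~50    # placeholder: a commit that still worked
# git checks out a midpoint; launch ComfyUI, queue a Flux prompt, then mark it:
#   git bisect good   (it worked)   or   git bisect bad   (it crashed)
# Repeat until git prints "<hash> is the first bad commit", then clean up:
git bisect reset
```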
Alternatively, the problem may be caused by being on a commit just before https://github.com/comfyanonymous/ComfyUI/commit/b334605a6631c12bbe7b3aff6d77526f47acdf42, as that commit addresses OOMs caused by erroneous model loading.
I pulled the latest commit, it still happens. Also, I couldn't exactly find which commit actually broke it.
Portable version is broken too.
Both FP16 and FP8? I have only FP16 downloaded. It does not even attempt to load the checkpoint into RAM. (--lowvram)
I have only FP16 too. Haven't tried FP8.
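For rough sizing (assuming Flux dev's commonly cited ~12B parameter count, an assumption, not something stated in this thread), the weights alone need roughly twice the memory at FP16 as at FP8, which matters for whether the load even fits in RAM:

```python
def model_bytes(params_billions: float, bytes_per_param: float) -> float:
    """Approximate weight size in GiB; ignores activations and overhead."""
    return params_billions * 1e9 * bytes_per_param / 2**30

print(f"fp16: {model_bytes(12, 2):.1f} GiB")  # ~22.4 GiB
print(f"fp8:  {model_bytes(12, 1):.1f} GiB")  # ~11.2 GiB
```

Loaders that upcast or keep a second copy during conversion can transiently need far more than this, which fits the RAM-spike reports later in this thread.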
Other Stable Diffusion models don't crash Comfy, only Flux models crash it.
Same problem. Disabling custom nodes does nothing, so I copied the output with the nodes active, since it contains environment info.
[START] Security scan
[DONE] Security scan
## ComfyUI-Manager: installing dependencies done.
** ComfyUI startup time: 2024-08-09 17:42:52.116158
** Platform: Windows
** Python version: 3.10.11 (tags/v3.10.11:7d4cc5a, Apr 5 2023, 00:38:17) [MSC v.1929 64 bit (AMD64)]
** Python executable: F:\stability\Data\Packages\ComfyUI\venv\Scripts\python.exe
** ComfyUI Path: F:\stability\Data\Packages\ComfyUI
** Log path: F:\stability\Data\Packages\ComfyUI\comfyui.log
Prestartup times for custom nodes:
4.0 seconds: F:\stability\Data\Packages\ComfyUI\custom_nodes\ComfyUI-Manager
Total VRAM 12282 MB, total RAM 32468 MB
pytorch version: 2.1.2+cu121
Set vram state to: LOW_VRAM
Device: cuda:0 NVIDIA GeForce RTX 4080 Laptop GPU : cudaMallocAsync
Using pytorch cross attention
[Prompt Server] web root: F:\stability\Data\Packages\ComfyUI\web
Adding extra search path checkpoints F:\stability\Data\Models\StableDiffusion
Adding extra search path vae F:\stability\Data\Models\VAE
Adding extra search path loras F:\stability\Data\Models\Lora
Adding extra search path loras F:\stability\Data\Models\LyCORIS
Adding extra search path upscale_models F:\stability\Data\Models\ESRGAN
Adding extra search path upscale_models F:\stability\Data\Models\RealESRGAN
Adding extra search path upscale_models F:\stability\Data\Models\SwinIR
Adding extra search path embeddings F:\stability\Data\Models\TextualInversion
Adding extra search path hypernetworks F:\stability\Data\Models\Hypernetwork
Adding extra search path controlnet F:\stability\Data\Models\ControlNet
Adding extra search path controlnet F:\stability\Data\Models\T2IAdapter
Adding extra search path clip F:\stability\Data\Models\CLIP
Adding extra search path clip_vision F:\stability\Data\Models\InvokeClipVision
Adding extra search path diffusers F:\stability\Data\Models\Diffusers
Adding extra search path gligen F:\stability\Data\Models\GLIGEN
Adding extra search path vae_approx F:\stability\Data\Models\ApproxVAE
Adding extra search path ipadapter F:\stability\Data\Models\IpAdapter
Adding extra search path ipadapter F:\stability\Data\Models\InvokeIpAdapters15
Adding extra search path ipadapter F:\stability\Data\Models\InvokeIpAdaptersXl
Adding extra search path prompt_expansion F:\stability\Data\Models\PromptExpansion
[Crystools INFO] Crystools version: 1.16.6
[Crystools INFO] CPU: 13th Gen Intel(R) Core(TM) i9-13950HX - Arch: AMD64 - OS: Windows 10
[Crystools INFO] Pynvml (Nvidia) initialized.
[Crystools INFO] GPU/s:
[Crystools INFO] 0) NVIDIA GeForce RTX 4080 Laptop GPU
[Crystools INFO] NVIDIA Driver: 560.81
[inference_core_nodes.controlnet_preprocessors] | INFO -> Using ckpts path: F:\stability\Data\Packages\ComfyUI\custom_nodes\ComfyUI-Inference-Core-Nodes\src\inference_core_nodes\controlnet_preprocessors\ckpts
[inference_core_nodes.controlnet_preprocessors] | INFO -> Using symlinks: False
[inference_core_nodes.controlnet_preprocessors] | INFO -> Using ort providers: ['CUDAExecutionProvider', 'DirectMLExecutionProvider', 'OpenVINOExecutionProvider', 'ROCMExecutionProvider', 'CPUExecutionProvider', 'CoreMLExecutionProvider']
DWPose: Onnxruntime with acceleration providers detected
F:\stability\Data\Packages\ComfyUI\venv\lib\site-packages\diffusers\models\transformers\transformer_2d.py:34: FutureWarning: `Transformer2DModelOutput` is deprecated and will be removed in version 1.0.0. Importing `Transformer2DModelOutput` from `diffusers.models.transformer_2d` is deprecated and this will be removed in a future version. Please use `from diffusers.models.modeling_outputs import Transformer2DModelOutput`, instead.
deprecate("Transformer2DModelOutput", "1.0.0", deprecation_message)
### Loading: ComfyUI-Manager (V2.48.6)
### ComfyUI Revision: 2504 [55ad9d5f] | Released on '2024-08-09'
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/alter-list.json
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/model-list.json
Use STYLE(weight_interpretation, normalization) at the start of a prompt to use advanced encodings
Weight interpretations available: comfy,perp
Normalization types available: none
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/github-stats.json
[comfyui_controlnet_aux] | INFO -> Using ckpts path: F:\stability\Data\Packages\ComfyUI\custom_nodes\comfyui_controlnet_aux\ckpts
[comfyui_controlnet_aux] | INFO -> Using symlinks: False
[comfyui_controlnet_aux] | INFO -> Using ort providers: ['CUDAExecutionProvider', 'DirectMLExecutionProvider', 'OpenVINOExecutionProvider', 'ROCMExecutionProvider', 'CPUExecutionProvider', 'CoreMLExecutionProvider']
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/custom-node-list.json
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/extension-node-map.json
Import times for custom nodes:
0.0 seconds: F:\stability\Data\Packages\ComfyUI\custom_nodes\websocket_image_save.py
0.0 seconds: F:\stability\Data\Packages\ComfyUI\custom_nodes\sd-dynamic-thresholding
0.0 seconds: F:\stability\Data\Packages\ComfyUI\custom_nodes\comfyui-inpaint-nodes
0.0 seconds: F:\stability\Data\Packages\ComfyUI\custom_nodes\ComfyMath
0.0 seconds: F:\stability\Data\Packages\ComfyUI\custom_nodes\comfyui-tooling-nodes
0.0 seconds: F:\stability\Data\Packages\ComfyUI\custom_nodes\ComfyUI_IPAdapter_plus
0.0 seconds: F:\stability\Data\Packages\ComfyUI\custom_nodes\ComfyUI_ExtraModels
0.1 seconds: F:\stability\Data\Packages\ComfyUI\custom_nodes\comfyui_controlnet_aux
0.1 seconds: F:\stability\Data\Packages\ComfyUI\custom_nodes\comfyui-prompt-control
0.4 seconds: F:\stability\Data\Packages\ComfyUI\custom_nodes\ComfyUI_TensorRT
0.5 seconds: F:\stability\Data\Packages\ComfyUI\custom_nodes\ComfyUI-Crystools
1.7 seconds: F:\stability\Data\Packages\ComfyUI\custom_nodes\ComfyUI-Manager
1.9 seconds: F:\stability\Data\Packages\ComfyUI\custom_nodes\ComfyUI-Inference-Core-Nodes
Starting server
To see the GUI go to: http://127.0.0.1:8188
FETCH DATA from: F:\stability\Data\Packages\ComfyUI\custom_nodes\ComfyUI-Manager\extension-node-map.json [DONE]
got prompt
It does not continue. Output from nvidia-smi:
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.81 Driver Version: 560.81 CUDA Version: 12.6 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Driver-Model | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 4080 ... WDDM | 00000000:01:00.0 On | N/A |
| N/A 44C P8 4W / 175W | 1031MiB / 12282MiB | 2% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 1892 C+G C:\Windows\explorer.exe N/A |
| 0 N/A N/A 3588 C+G ...n\126.0.2592.113\msedgewebview2.exe N/A |
| 0 N/A N/A 8024 C+G ...0.0_x64__cv1g1gvanyjgm\WhatsApp.exe N/A |
| 0 N/A N/A 8332 C+G ...ekyb3d8bbwe\PhoneExperienceHost.exe N/A |
| 0 N/A N/A 10628 C+G ...ft Office\root\Office16\OUTLOOK.EXE N/A |
| 0 N/A N/A 10908 C+G ...al\Discord\app-1.0.9157\Discord.exe N/A |
| 0 N/A N/A 10912 C+G ...CBS_cw5n1h2txyewy\TextInputHost.exe N/A |
| 0 N/A N/A 11044 C+G ....0_x64__kzh8wxbdkxb8p\DCv2\DCv2.exe N/A |
| 0 N/A N/A 11528 C+G ...5n1h2txyewy\ShellExperienceHost.exe N/A |
| 0 N/A N/A 12700 C+G ...nt.CBS_cw5n1h2txyewy\SearchHost.exe N/A |
| 0 N/A N/A 12728 C+G ...2txyewy\StartMenuExperienceHost.exe N/A |
| 0 N/A N/A 14148 C+G F:\stability\StabilityMatrix.exe N/A |
| 0 N/A N/A 15608 C+G ...n\126.0.2592.113\msedgewebview2.exe N/A |
| 0 N/A N/A 15680 C+G ...__8wekyb3d8bbwe\Notepad\Notepad.exe N/A |
| 0 N/A N/A 16320 C+G ...n\NVIDIA app\CEF\NVIDIA Overlay.exe N/A |
| 0 N/A N/A 18692 C+G ...ys\WinUI3Apps\PowerToys.Peek.UI.exe N/A |
| 0 N/A N/A 18836 C+G ...werToys\PowerToys.PowerLauncher.exe N/A |
| 0 N/A N/A 19384 C+G ...werToys\PowerToys.ColorPickerUI.exe N/A |
| 0 N/A N/A 19800 C+G ...__8wekyb3d8bbwe\WindowsTerminal.exe N/A |
| 0 N/A N/A 22020 C+G ...\cef\cef.win7x64\steamwebhelper.exe N/A |
| 0 N/A N/A 22184 C+G ...les\Microsoft OneDrive\OneDrive.exe N/A |
| 0 N/A N/A 23308 C+G ...m Files (x86)\Overwolf\Overwolf.exe N/A |
| 0 N/A N/A 23604 C+G ...rwolf\0.256.0.2\OverwolfBrowser.exe N/A |
| 0 N/A N/A 24508 C+G ...ress\CefSharp.BrowserSubprocess.exe N/A |
| 0 N/A N/A 24684 C+G ...les\Microsoft OneDrive\OneDrive.exe N/A |
| 0 N/A N/A 25000 C+G ...\cef\cef.win7x64\steamwebhelper.exe N/A |
| 0 N/A N/A 26256 C+G ...crosoft\Edge\Application\msedge.exe N/A |
| 0 N/A N/A 30260 C+G ...oogle\Chrome\Application\chrome.exe N/A |
+-----------------------------------------------------------------------------------------+
Also, rarely, not just ComfyUI crashes but the whole GPU stack goes down with it: Chrome and apparently the CUDA runtime too (nvidia-smi stops working).
OS: Windows 11 22635
Happens with Arch too.
Only ComfyUI crashes for me. Nothing else happens, not even elevated RAM or VRAM usage.
Ran ComfyUI through Nsight Systems.
Logs: https://drive.google.com/file/d/1mEIIkvHAykUCHl_cFJt3AmMENlt_zWtQ/view?usp=sharing https://drive.google.com/file/d/10HXsy0A96zMALsPYRib59UrvSUSWkh_J/view?usp=drive_link
The problem is likely the drivers. It does not work on 560.81; it worked on 560.70.
Edit: It is not ComfyUI or the Nvidia drivers. I downgraded both and it still does not work.
It was fixed for me by moving my pagefile from C: to F: (both on the same SSD).
I might have a solution for everyone (at least it worked for me).
After struggling for weeks I tried an even older driver version for my GPU, because I've seen that these kinds of problems mainly hit people with an "RTX 4070 Ti SUPER 16GB". I was even about to send the card back because I had tried EVERYTHING...
However, with version 551.23 for this GPU (see picture) my problems were solved!!!!
I really hope this version fixes your problems too. :)
I just got this card a week ago and installed 560.70. Then after a few days 560.81 was released and it still worked. After 1-2 days it completely stopped working with Flux models. I will try rolling back to 560.70 and see if that was the problem.
Switched back to 560.70; doesn't work. Now I'll try 551.23 as suggested by @sabum6800 . Also, why this specific version? Did you try every other version after it?
And... nope. Switched to 551.23, nothing really changed. Still crashes the same way.
It interestingly works on Arch Linux right now, latest drivers, latest commit.
I have a 2070 SUPER with 8GB VRAM and 32GB RAM, latest drivers 560.81. ComfyUI crashes after "got prompt" for Flux Dev, Schnell, and the FP8 versions. Other models work for me. I have no idea how to get the logs, though.
I tried the portable and manual installations of ComfyUI; both have the same issue.
Possibly not enough RAM/swap/pagefile to load the model at the given precision? Usually when a process is "Killed" in Linux, it's to prevent an out of memory situation that would lock the system up. I'd suggest looking into creating or enlarging a swap file. https://wiki.archlinux.org/title/Swap#Swap_file_creation
Try 8GB and see if that's enough, then go up by 4GB until the python process isn't killed on loading.
Thanks for mentioning this. I have a new Arch install myself and never allocated a swapfile. This fixed my issue that appears similar to this.
Yeah, it still doesn't work in Windows.
Try setting your page file to "system managed size".
It is already set to "system managed size". Also, the crash happens before the model is even loaded into RAM.
As mentioned in the duplicate https://github.com/comfyanonymous/ComfyUI/issues/4198: try starting without --lowvram. As to WHY this helps, I still don't know yet.
I'm starting without --lowvram, but it automatically switches to lowvram even though I use --normalvram.
I think I found the problem: I may be affected by the Intel Raptor Lake instability and degradation issue (elevated operating voltage) after all.
Before you do anything, UPDATE YOUR BIOS OR YOU MAY DAMAGE YOUR CPU!
Update your BIOS before you do this and make sure the update includes something like "Update microcode 0x129 to address sporadic Vcore elevation behavior announced by Intel".
The following models are affected:
13th gen:
i9-13900KS
i9-13900K
i9-13900KF
i9-13900F
i9-13900
i7-13700K
i7-13700KF
i7-13790F
i7-13700F
i7-13700
i5-13600K
i5-13600KF
14th gen:
i9-14900KS
i9-14900K
i9-14900KF
i9-14900F
i9-14900
i7-14700K
i7-14700KF
i7-14790F
i7-14700F
i7-14700
i5-14600K
i5-14600KF
Solution
Load a low-voltage profile in UEFI (I never tried this before because I assumed the BIOS defaults are fine):
If I disable the E-cores, I am able to run Flux with --lowvram.
I have a Ryzen CPU, a 3080 Ti with 12 GB VRAM, 32 GB system RAM, and have the same problem. I tried FP8, checkpoints, several workflows, upgrading ComfyUI, also PyTorch to 2.4, etc.; still the same issue. I noticed that it didn't try to load anything, nor did it use my VRAM.
Maybe this gives a hint to the error, as I don't have time to dive deep into where exactly it comes from, but after having the same issues I tried several things like updating/downgrading, adding Flux-related modules, and so on.
To me it seems like the "weight_dtype" parameter of the "Load Diffusion Model" node should not be set to default, but to e.g. fp8_e5m2 instead. After switching the dtype, it started to load instead of crashing right away.
I'm also on a 4070 Ti SUPER and have the latest version installed:
Maybe it's trying to convert/cast dtypes and the error lies there, but I can't tell without looking into it.
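For context on why the dtype choice can matter: fp8_e5m2 trades mantissa bits for exponent range, while e4m3fn trades range for precision, so a cast that works in one format can overflow in the other. A stdlib-only illustration of each format's largest finite value (standard float8 definitions, nothing ComfyUI-specific):

```python
def max_finite(exp_bits: int, man_bits: int, fn: bool = False) -> float:
    """Largest finite value of a simple binary float format.

    fn=True models the "finite-only" variants (like e4m3fn), which reuse
    the all-ones exponent code for normal numbers instead of infinities.
    """
    bias = 2 ** (exp_bits - 1) - 1
    if fn:
        exp = (2 ** exp_bits - 1) - bias   # all-ones exponent is usable
        frac = 2 - 2 ** (1 - man_bits)     # only the top mantissa code is NaN
    else:
        exp = (2 ** exp_bits - 2) - bias   # all-ones exponent = inf/NaN
        frac = 2 - 2 ** (-man_bits)
    return frac * 2.0 ** exp

print(max_finite(5, 2))        # e5m2   -> 57344.0
print(max_finite(4, 3, True))  # e4m3fn -> 448.0
```

So e5m2 can represent values over 100x larger than e4m3fn, at the cost of one mantissa bit of precision.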
sdxl+flux same problem with rtx 3060 12gb
RTX 3090, same problem. Hangs at "got prompt", then nothing happens, no error messages. I have an older version in StabilityMatrix that runs fine, but it does not support the latest Flux enhancements.
My case: Windows, RTX 3060, 12GB VRAM, 32GB RAM (+16GB swap), recent portable ComfyUI.
UNET version (file flux1-dev-fp8-e4m3fn.safetensors, 11GB, loaded via the "Load Diffusion Model" node) works fine.
Checkpoint version (file flux1-dev-fp8.safetensors, 17GB, loaded via the "Load Checkpoint" node) exits without errors after adding a prompt to the queue.
I increased swap to 32GB. Now the checkpoint version works!
That worked: I increased the swap to 32 GB as you suggested. What happens is that, for some reason, before loading the model into VRAM it actually consumed 44 GB of system RAM (which I did not have); then it loaded the model into 11 GB of VRAM and system RAM usage went down to 12 GB. So that initial peak seems to be the problem, at least for me.
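If anyone wants to confirm a transient RAM spike like this, on Unix-like systems (the Arch setups in this thread; the resource module doesn't exist on Windows) the kernel tracks the process's high-water mark for you. A minimal stdlib sketch, using a large allocation as a stand-in for loading a checkpoint:

```python
import resource

def peak_rss_mb() -> float:
    # ru_maxrss is reported in KiB on Linux (bytes on macOS)
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024

before = peak_rss_mb()
blob = bytearray(200 * 1024 * 1024)  # stand-in for loading a large checkpoint
after = peak_rss_mb()
print(f"peak RSS grew by ~{after - before:.0f} MiB")
```

Because ru_maxrss is a high-water mark, it captures a load-time spike even after the memory is released again, which is exactly the pattern described above.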
Thanks for your answer.
Expected Behavior
To not crash.
Actual Behavior
ComfyUI crashes 5-10 seconds after I click Queue Prompt while using Flux Schnell. This does not happen with the Flux Dev model.
Steps to Reproduce
Use Flux Schnell model.
Debug Logs
Other
I'm using the venv from Stable Diffusion WebUI.
Nvidia GRD 560.81