city96 / ComfyUI-GGUF

GGUF Quantization support for native ComfyUI models
Apache License 2.0
950 stars · 59 forks

custom_nodes\ComfyUI-GGUF\nodes.py:79: UserWarning: The given NumPy array is not writable #95

Open Sambouza opened 2 months ago

Sambouza commented 2 months ago

New to image generation in general here; after setting up Stable Diffusion I found out about Flux. I am using an RTX 3060 Ti with 8 GB of VRAM, and I saw others getting flux-dev to work on smaller cards, so I'm not sure if my card is the weak link.

The error seemed related to the ComfyUI-GGUF extension, so I'm posting here for help. I will link the workflow and log file.

image comfyui.log

blepping commented 1 month ago

it's just a warning, you can safely ignore it.
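For context, a minimal reproduction of why this warning appears (not taken from the repo — it simulates the situation with a plain NumPy array): PyTorch emits it when `torch.from_numpy` receives a read-only array, because the resulting tensor would share a buffer it must not modify.

```python
import numpy as np

# Simulate the situation behind the warning: a read-only source buffer,
# such as a memory-mapped model file opened in read-only mode.
arr = np.arange(4, dtype=np.float32)
arr.flags.writeable = False

# torch.from_numpy(arr) on such an array would emit:
#   UserWarning: The given NumPy array is not writable, and PyTorch does
#   not support non-writable tensors.
# because the tensor would share memory it is not allowed to modify.

# Copying yields a writable array, which is how the warning is silenced
# at the cost of one extra copy in RAM:
safe = arr.copy()
assert safe.flags.writeable and not arr.flags.writeable
```

Since the loader never writes to the weights, sharing the read-only buffer is harmless — hence "you can safely ignore it."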

Sambouza commented 1 month ago

then I'm not sure what's causing comfyui to completely crash (alongside my desktop environment).

hopefully future models of flux and future updates for comfyui fix the instability

blepping commented 1 month ago

(alongside my desktop environment).

that sounds much more like you're running out of VRAM, definitely would not have anything to do with the NumPy warning.

ComfyUI recently added an option to reserve some amount of VRAM. you might try increasing that.

Sambouza commented 1 month ago

that's weird, I'm using quantized versions of flux (gguf K_0) and I saw people with 4 GB VRAM cards running them.

I have an 8GB 3060 Ti. I will definitely try reserving VRAM, because I don't see it getting utilized in Task Manager — only the RAM.

blepping commented 1 month ago

if your DE is crashing you're already in the territory of weird. it can depend on the OS and other factors; when i had an 8GB card i had similar issues on Linux running SDXL.

if you don't have much system RAM you might also be experiencing issues there. i think the GGUF stuff currently isn't mmapping, so you likely also need enough RAM to hold the whole model plus whatever else is going on.
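The distinction being made here can be sketched with NumPy's `memmap` as a stand-in (a toy file rather than a real model, but the memory behavior is the same):

```python
import os
import tempfile

import numpy as np

# Write a small file of float32 "weights" for the demonstration.
path = os.path.join(tempfile.gettempdir(), "weights_demo.bin")
np.arange(1024, dtype=np.float32).tofile(path)

# Full read: the entire file is copied into process RAM up front,
# so peak RAM usage is at least the size of the model file.
full = np.fromfile(path, dtype=np.float32)

# Memory-mapped: pages are faulted in on access, so resident memory
# only tracks what you actually touch.
mapped = np.memmap(path, dtype=np.float32, mode="r")

# Both views see the same data; note that mode="r" makes the mapped
# array read-only — exactly the kind of array that triggers the
# non-writable warning when handed to torch.from_numpy.
assert float(full[10]) == float(mapped[10]) == 10.0
assert not mapped.flags.writeable
```

Without mmap, the full model must fit in RAM (or swap) on top of everything else running.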

Sambouza commented 1 month ago

Probably because of my system RAM, it's an old DDR4 16GB dual stick and in task manager it gets utilized to 100%.

So should I upgrade to a DDR5 32GB dual stick for my workflows then? I already plan to upgrade my SSD and RAM in a few months.

blepping commented 1 month ago

Probably because of my system RAM, it's an old DDR4 16GB dual stick and in task manager it gets utilized to 100%.

if you mean you have a total of 16GB RAM then yeah, that's fairly likely to be the issue. even the Q4 quantized Flux is nearly half your total system RAM, if stuff is possibly getting copied extra times and you have other apps running as well (DE, web browser, etc) you could end up in swap hell pretty easily.
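The "nearly half your total system RAM" estimate checks out on the back of an envelope (the parameter count below is an assumption for a Flux-class model, not a measured value):

```python
# GGUF Q4_0 stores blocks of 32 weights in 18 bytes => 4.5 bits/weight.
params = 12e9            # assumed parameter count for flux1-dev
bits_per_weight = 4.5

size_gib = params * bits_per_weight / 8 / 2**30
print(f"~{size_gib:.1f} GiB")  # roughly 6.3 GiB, close to half of 16 GiB
```

And that is just the diffusion model — the text encoders and VAE need memory on top of it.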

So should I upgrade to a DDR5 32GB dual stick for my workflows then?

DDR5 uses a different slot than DDR4 if i remember correctly, so unless you want to upgrade your motherboard (and probably CPU too) you'd have to stick with the same type of RAM as you have currently. i don't think DDR5 vs DDR4 would really make much difference for this sort of stuff and DDR4 should be quite cheap.

note: you should do your own research/compatibility checking before you decide to actually go out and buy stuff, don't just do it based on what some random person on the internet says. :)

Sambouza commented 1 month ago

as I said, I already planned to get these parts. I already checked compatibility before opening this issue. and I specify DDR just in case there is some weird instability/bug in comfyui, because it seems to have a lot of low-level problems (at least in my experience).

And yeah, as I also said, it absolutely kills my RAM. and even if upgrading did not help, I have other "creative" applications/workflows that I personally need, like caching in Blender/Premiere and running emulators and such.

but thanks for your help and quick responses! I have not encountered anyone with this exact issue. the flux models look amazing; I only got the chance to load the fp4 model once and never again, which leads me to think I'm running low on RAM headroom.

blepping commented 1 month ago

as I said, I already planned to get these parts.

you actually never mentioned upgrading anything other than RAM sticks. feel free to read back through the conversation and you can verify that is the case. if you were already planning to upgrade your motherboard to a DDR5 compatible one, that's great - but there's no way i could have known. just trying to make sure you didn't accidentally buy DIMMs you couldn't actually use.

kovern commented 1 month ago

Same issue here. I have a laptop with Ubuntu 22.04 + Win10 and an 8 GB RTX 3070. Under Ubuntu, everything is fine. But today I tried to start Comfy on Windows. Installed fresh Python, fresh torch (2.4.1+cu124), fresh Comfy, and got the same warning: UserWarning: The given NumPy array is not writable, and PyTorch does not support non-writable tensors

The gguf loader started anyway, but within milliseconds it crashed.

What is further interesting to me is that theoretically Comfy supports split loading of the flux model, so I wouldn't even need the gguf model — but when I let Comfy handle the split model loading (by choosing the fp8 or fp16 version) it crashes the same way, silently, without any message.

Sambouza commented 1 month ago

Probably because of my system RAM, it's an old DDR4 16GB dual stick and in task manager it gets utilized to 100%.

So should I upgrade to a DDR5 32GB dual stick for my workflows then? I already plan to upgrade my SSD and RAM in a few months.

Not sure what you're talking about, because I clearly stated that I'm planning to upgrade. Maybe you skimmed through my messages, but that's fine. When I said "already planned" I omitted that I had checked that all components were compatible.

And I'm really happy I'm not the only one who's having this issue 😁

I'm not sure if I can share anything useful other than logs, which don't seem to show any error besides the NumPy warning; it simply crashes the .bat script. Let me know, kovern, if you find a fix! I simply don't have the time to troubleshoot anymore

al-swaiti commented 1 month ago

did you try this: `pip install --force-reinstall numpy==1.26.4`

city96 commented 1 month ago

There's an initial spike during the first load on Windows with mmap; this is a known issue. I think it happens with safetensors as well? In any case, my guess would be that you're running out of system memory, not VRAM.

Adding more swap/pagefile will fix it, since the memory should be released after the first load ("attempting to release mmap" message in console), but this will cause wear on your SSD so I wouldn't recommend it. It'd also be pretty slow.

There have been reports of pytorch 2.4.X causing issues on ComfyUI (except for the cu124 one?) so it may be worth seeing if the 2.3.X or the nightly 2.5.X releases help.

kovern commented 1 month ago

There's an initial spike during the first load on Windows with mmap; this is a known issue. I think it happens with safetensors as well? In any case, my guess would be that you're running out of system memory, not VRAM.

Adding more swap/pagefile will fix it, since the memory should be released after the first load ("attempting to release mmap" message in console), but this will cause wear on your SSD so I wouldn't recommend it. It'd also be pretty slow.

There have been reports of pytorch 2.4.X causing issues on ComfyUI (except for the cu124 one?) so it may be worth seeing if the 2.3.X or the nightly 2.5.X releases help.

Thanks for the answer. I have 32GB RAM, and I didn't see any sign of a memory spike. In any case, I imagine 32GB should be enough :)

Already tried several pytorch without success but will try the nightly.

city96 commented 1 month ago

@kovern I guess you could play with the --reserve-vram launch arg to see if that does anything to help. Possibly also --disable-cuda-malloc but that'll usually make it worse (and I don't know if it does anything on linux or not).
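For reference, a hypothetical launch line for these flags (the reserve value is a placeholder — tune it to your setup):

```shell
# Reserve 1 GiB of VRAM for the OS / desktop compositor:
python main.py --reserve-vram 1.0

# Fall back to the default CUDA allocator (usually slower, but
# occasionally more stable when the async allocator misbehaves):
python main.py --disable-cuda-malloc
```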

al-swaiti commented 1 month ago

I think he uses portable ComfyUI on Windows

anthonyaquino83 commented 1 month ago

Hi, guys... I have the same problem: RTX 3060 6GB, it shows "The given NumPy array is not writable" and it crashes portable ComfyUI. So I installed Forge inside Pinokio and tried flux1-schnell-Q4_0.gguf, and it generated a 1024x1024 image in 37.4 s using the prompt "cat in a hat". Maybe it's a problem with some ComfyUI node.

city96 commented 1 month ago

Linking comment from the other issue which is most likely the same as this - it seems related to the amount of pagefile present: https://github.com/city96/ComfyUI-GGUF/issues/102#issuecomment-2336637611

skimy2023 commented 1 month ago

Hi, guys... I have the same problem, RTX 3060 6GB, it shows "The given NumPy array is not writable" and it crashes portable ComfyUI. So I installed Forge inside Pinokio and I tried flux1-schnell-Q4_0.gguf and it generated a 1024x1024 image after 37.4s using the prompt "cat in a hat". Maybe it's some problem with some ComfyUI node.

Could you please share how you installed Forge inside Pinokio? I'd appreciate any specific steps you can provide.

city96 commented 1 month ago

Possibly fixed on latest by the commit above that references this issue.

anthonyaquino83 commented 1 month ago

Hi, guys... I have the same problem, RTX 3060 6GB, it shows "The given NumPy array is not writable" and it crashes portable ComfyUI. So I installed Forge inside Pinokio and I tried flux1-schnell-Q4_0.gguf and it generated a 1024x1024 image after 37.4s using the prompt "cat in a hat". Maybe it's some problem with some ComfyUI node.

Could you please share how you installed Forge inside Pinokio? I'd appreciate any specific steps you can provide.

Yep, I just used the one click installer for Forge available in this page: https://pinokio.computer/app

anthonyaquino83 commented 1 month ago

Possibly fixed on latest by the commit above that references this issue.

I just updated comfyui using the update all option and flux is running now in my comfyui too, thanks for the fix. 🥳

Sambouza commented 1 month ago

It appears to work for me now too, but it is still slightly inconsistent; maybe a third of the time it works without crashing. I will look into it and close this issue tomorrow, in case anyone wants to share anything to increase stability

Donzox commented 1 month ago

It appears to work now too, but it is still slightly inconsistent. Maybe 1/3rd of the time it does work without crashing. I will look into it and I will close this issue tomorrow incase anyone wants to share anything to increase stability

How did you get it to work? Ever since I got the message ComfyUI-GGUF\nodes.py:79: UserWarning: The given NumPy array is not writable, and PyTorch does not support non-writable tensors. the GGUF model has been slightly slower than Flux Dev FP8. I don't get it, there were no problems two days ago.

Donzox commented 1 month ago

Tried Hyper-FLUX.1-dev-8steps + Flux Dev fp8 = 1.21 s/it, and Hyper-FLUX.1-dev-8steps + Q8 = 2.37 s/it

Sambouza commented 1 month ago

It appears to work now too, but it is still slightly inconsistent. Maybe 1/3rd of the time it does work without crashing. I will look into it and I will close this issue tomorrow incase anyone wants to share anything to increase stability

How did you get it to work? Ever since I got the message ComfyUI-GGUF\nodes.py:79: UserWarning: The given NumPy array is not writable, and PyTorch does not support non-writable tensors. now the GGUF model is slightly slower than Flux Dev FP8. I don't get it, there was no problems two days ago.

Try updating ComfyUI; as someone suggested, there was a patch to ComfyUI referencing this issue. fp8 models do not seem to work for me, just gguf models. I only had the chance to run the fp8 model once

Donzox commented 1 month ago

Try updating comfyUI, as per someone suggested there was a patch to comfyUI referencing this issue. fp8 models do not seem to work for me, just gguf models. I only had the chance to run the fp8 model once

I updated all, reinstalled ComfyUI-GGUF and made sure the requirements are updated also. The error still won't go away.

city96 commented 1 month ago

The actual numpy message can be ignored; it's just a warning, as mentioned above.

As for actually crashing, yeah, I assume it'll still happen if it can't allocate enough pagefile. Can't really think of any easy way to bring memory usage down further, especially in cases where the model doesn't fit completely into vram.
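If pagefile pressure is the suspect, Windows users can inspect the current allocation with a one-liner (a diagnostic sketch; run in PowerShell — values are reported in MB):

```shell
# Show pagefile name, allocated size, and current usage:
Get-CimInstance Win32_PageFileUsage |
    Format-List Name, AllocatedBaseSize, CurrentUsage
```

If `AllocatedBaseSize` is small relative to the model size, letting Windows manage the pagefile (or raising the cap) is the usual workaround, with the SSD-wear caveat mentioned above.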