Closed. makeoo1 closed this issue 1 month ago.
You will experience these issues if you haven't configured your NVIDIA drivers not to fall back to system memory in low-VRAM situations. See https://nvidia.custhelp.com/app/answers/detail/a_id/5490/~/system-memory-fallback-for-stable-diffusion for more information. (This change was introduced relatively recently on the driver side.)
Try benchmarks for CPU, GPU, and RAM to see if one of them breaks, indicating a hardware problem?
> Try benchmarks for CPU, GPU, and RAM to see if one of them breaks, indicating a hardware problem?
did that, everything is fine
> You will experience these issues if you haven't configured your NVIDIA drivers not to fall back to system memory in low-VRAM situations. See https://nvidia.custhelp.com/app/answers/detail/a_id/5490/~/system-memory-fallback-for-stable-diffusion for more information. (This change was introduced relatively recently on the driver side.)
From what I'm reading, this seems to be aimed at GPUs without much VRAM, and the article is strictly about Stable Diffusion. I have a 4090 with 24 GB of VRAM, so I don't think I should be running out of it, and from what I've seen this problem also extends to other apps (DaVinci, A1111, Premiere Pro). It's also not clear to me why this problem would appear in the middle of a session when I had never had it before. Can you please help me understand? Thanks a lot!
@makeoo1 I have the same amount of VRAM; similar symptoms too. I highly recommend you try configuring the fallback behavior. Windows (DWM) is probably eating ~4GB right off the bat, so you may be underestimating how much VRAM gets used during encoding, AI, and other scenarios.
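For reference, a quick way to see how much VRAM is actually free before ComfyUI even loads a model is to ask the driver directly. A minimal sketch, assuming a CUDA build of PyTorch is installed in the same environment ComfyUI uses:

```python
# Quick VRAM headroom check for device 0, using figures reported by the driver.
# Assumes a CUDA-enabled PyTorch install.
import torch

free_bytes, total_bytes = torch.cuda.mem_get_info(0)  # (free, total) in bytes
gib = 1024 ** 3
print(f"Total VRAM : {total_bytes / gib:.1f} GiB")
print(f"Free VRAM  : {free_bytes / gib:.1f} GiB")
print(f"In use     : {(total_bytes - free_bytes) / gib:.1f} GiB (includes DWM, browsers, etc.)")
```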
@riverar
Thanks for explaining! That did indeed help, and it seems to be working fine now. I haven't stress-tested it yet with a big project or anything like that, but I will update this post as soon as I do.
Do you mind if I ask why you think this happened only now, and in the middle of a session, considering that I had been using ComfyUI with no problems for a long time?
Can I also ask what you think about this old issue, where they discuss a similar problem? Do you see any similarity? https://github.com/comfyanonymous/ComfyUI/issues/888
Thank you again for the help
@makeoo1 This change on the driver side went out sometime around June-Oct 2023 (depending on which driver set you use). I believe anything before that behaves similarly to "no sysmem fallback", so your apps during that time period would have been working normally.
I don't know enough about the AI space to explain why this is happening. Interestingly, Task Manager is reporting low VRAM usage throughout so perhaps there are spikes that are getting lost in the graphs? Or perhaps the driver is incorrectly anticipating low VRAM? Definitely worth investigating--it makes me wonder if "high VRAM requirements" in the AI space could be a partial misunderstanding of a bug in some library.
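One way to test the "spikes lost in the graphs" idea is to poll the driver much faster than Task Manager's graph updates. A rough sketch using the NVML bindings (pip install nvidia-ml-py); the 50 ms interval and 2-minute window are arbitrary choices:

```python
# Poll VRAM usage ~20 times per second and record the peak, to catch short spikes
# that Task Manager's ~1 s graph smooths away. Requires nvidia-ml-py (pynvml).
import time
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)

peak = 0
t_end = time.time() + 120  # watch for 2 minutes while a workflow runs
while time.time() < t_end:
    used = pynvml.nvmlDeviceGetMemoryInfo(handle).used
    peak = max(peak, used)
    time.sleep(0.05)

print(f"Peak VRAM observed: {peak / 1024**3:.2f} GiB")
pynvml.nvmlShutdown()
```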
I am now getting a new error when encoding images from a video file. So far this is the first error I have seen since I disabled the system memory fallback.
To see the GUI go to: http://127.0.0.1:8188
FETCH DATA from: C:\_ComfyUi\ComfyUI\custom_nodes\ComfyUI-Manager\extension-node-map.json
[comfy_mtb] | WARNING -> Found multiple match, we will pick the first C:\_ComfyUi\ComfyUI\models\upscale_models ['C:\_ComfyUi\ComfyUI\models\upscale_models', 'C:/SD/stable-diffusion-webui/models/ESRGAN', 'C:/SD/stable-diffusion-webui/models/RealESRGAN', 'C:/SD/stable-diffusion-webui/models/SwinIR']
got prompt
ERROR:root:Failed to validate prompt for output 47:
ERROR:root:* PixelKSampleUpscalerProvider 36:
ERROR:root:  - Required input is missing: model
ERROR:root:Output will be ignored
ERROR:root:Failed to validate prompt for output 45:
ERROR:root:Output will be ignored
[rgthree] Using rgthree's optimized recursive execution.
[rgthree] First run patching recursive_output_delete_if_changed and recursive_will_execute.
[rgthree] Note: If execution seems broken due to forward ComfyUI changes, you can disable the optimization from rgthree settings in ComfyUI.
Requested to load CLIPVisionModelProjection
Loading 1 new model
model_type EPS
adm 0
Using xformers attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using xformers attention in VAE
clip missing: ['clip_l.text_projection', 'clip_l.logit_scale']
clip unexpected: ['clip_l.transformer.text_model.embeddings.position_ids']
left over keys: dict_keys(['model_ema.decay', 'model_ema.num_updates'])
[AnimateDiffEvo] - INFO - Loading motion module mm_sd_v15_v2.ckpt via Gen2
Requested to load SD1ClipModel
Loading 1 new model
Global Step: 20260
[] []
Requested to load AutoencoderKL
Loading 1 new model
fatal   : Memory allocation failure
fatal   : Memory allocation failure
ERROR:root:!!! Exception during processing !!!
ERROR:root:Traceback (most recent call last):
  File "C:\_ComfyUi\ComfyUI\execution.py", line 152, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
  File "C:\_ComfyUi\ComfyUI\execution.py", line 82, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
  File "C:\_ComfyUi\ComfyUI\execution.py", line 75, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
  File "C:\_ComfyUi\ComfyUI\nodes.py", line 313, in encode
    t = vae.encode(pixels[:,:,:,:3])
  File "C:\_ComfyUi\ComfyUI\comfy\sd.py", line 314, in encode
    samples[x:x+batch_number] = self.first_stage_model.encode(pixels_in).to(self.output_device).float()
  File "C:\_ComfyUi\ComfyUI\comfy\ldm\models\autoencoder.py", line 181, in encode
    z = self.encoder(x)
  File "C:\Users\M\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\M\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\_ComfyUi\ComfyUI\comfy\ldm\modules\diffusionmodules\model.py", line 523, in forward
    h = self.down[i_level].block[i_block](h, temb)
  File "C:\Users\M\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\M\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\_ComfyUi\ComfyUI\comfy\ldm\modules\diffusionmodules\model.py", line 142, in forward
    h = self.conv1(h)
  File "C:\Users\M\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\M\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\_ComfyUi\ComfyUI\comfy\ops.py", line 60, in forward
    return super().forward(*args, **kwargs)
  File "C:\Users\M\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\conv.py", line 460, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "C:\Users\M\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\conv.py", line 456, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: GET was unable to find an engine to execute this computation
Prompt executed in 21.50 seconds
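A note on that error: "GET was unable to find an engine to execute this computation" comes out of cuDNN, and in low-memory situations it is often just an out-of-memory failure in disguise (note the two "fatal : Memory allocation failure" lines right before it). A common mitigation is to retry the encode on smaller batches. This is a generic sketch of that pattern, not ComfyUI's actual code; encode_fn and the batch sizes are placeholders:

```python
# Generic "retry with smaller batches" pattern for CUDA memory failures during VAE encode.
# encode_fn stands in for any encoder call; this is NOT ComfyUI's real implementation.
import torch

def encode_in_chunks(encode_fn, pixels, start_batch=8):
    batch = start_batch
    while batch >= 1:
        try:
            chunks = [encode_fn(c) for c in torch.split(pixels, batch)]
            return torch.cat(chunks)
        except RuntimeError:          # CUDA OOM and cuDNN "engine" errors both surface here
            torch.cuda.empty_cache()  # drop cached blocks before retrying
            batch //= 2               # halve the batch size and try again
    raise RuntimeError("Encoding failed even with batch size 1")
```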
A new error now when combining frames, and then ComfyUI just crashes...
To see the GUI go to: http://127.0.0.1:8188
FETCH DATA from: C:\_ComfyUi\ComfyUI\custom_nodes\ComfyUI-Manager\extension-node-map.json
[comfy_mtb] | WARNING -> Found multiple match, we will pick the first C:\_ComfyUi\ComfyUI\models\upscale_models ['C:\_ComfyUi\ComfyUI\models\upscale_models', 'C:/SD/stable-diffusion-webui/models/ESRGAN', 'C:/SD/stable-diffusion-webui/models/RealESRGAN', 'C:/SD/stable-diffusion-webui/models/SwinIR']
Exception in callback _ProactorBasePipeTransport._call_connection_lost(None)
handle: <Handle _ProactorBasePipeTransport._call_connection_lost(None)>
Traceback (most recent call last):
  File "C:\Users\M\AppData\Local\Programs\Python\Python310\lib\asyncio\events.py", line 80, in _run
    self._context.run(self._callback, *self._args)
  File "C:\Users\M\AppData\Local\Programs\Python\Python310\lib\asyncio\proactor_events.py", line 162, in _call_connection_lost
    self._sock.shutdown(socket.SHUT_RDWR)
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host
got prompt
[rgthree] Using rgthree's optimized recursive execution.
[rgthree] First run patching recursive_output_delete_if_changed and recursive_will_execute.
[rgthree] Note: If execution seems broken due to forward ComfyUI changes, you can disable the optimization from rgthree settings in ComfyUI.
Requested to load CLIPVisionModelProjection
Loading 1 new model
model_type EPS
adm 0
Using xformers attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using xformers attention in VAE
clip missing: ['clip_l.text_projection', 'clip_l.logit_scale']
clip unexpected: ['clip_l.transformer.text_model.embeddings.position_ids']
left over keys: dict_keys(['model_ema.decay', 'model_ema.num_updates'])
[AnimateDiffEvo] - INFO - Loading motion module mm_sd_v15_v2.ckpt
[AnimateDiffEvo] - INFO - Loading motion LoRA v2_lora_ZoomOut.ckpt
Global Step: 10000
[AnimateDiffEvo] - INFO - Applying a v2 LoRA (v2_lora_ZoomOut.ckpt) to a v2 motion model.
Requested to load SD1ClipModel
Loading 1 new model
[] []
Requested to load AutoencoderKL
Loading 1 new model
[AnimateDiffEvo] - INFO - Sliding context window activated - latents passed in (24) greater than context_length 16.
[AnimateDiffEvo] - INFO - Using motion module mm_sd_v15_v2.ckpt:v2.
Requested to load BaseModel
Requested to load ControlNet
Requested to load AnimateDiffModel
Loading 3 new models
100%|██████████████████████████████████████████████████████████████████████████████████| 15/15 [00:24<00:00, 1.62s/it]
Using xformers attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using xformers attention in VAE
Leftover VAE keys ['model_ema.decay', 'model_ema.num_updates']
Requested to load AutoencoderKL
Loading 1 new model
ERROR:root:!!! Exception during processing !!!
ERROR:root:Traceback (most recent call last):
  File "C:\_ComfyUi\ComfyUI\execution.py", line 152, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
  File "C:\_ComfyUi\ComfyUI\execution.py", line 82, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
  File "C:\_ComfyUi\ComfyUI\execution.py", line 75, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
  File "C:\_ComfyUi\ComfyUI\custom_nodes\ComfyUI-VideoHelperSuite\videohelpersuite\nodes.py", line 326, in combine_video
    images = tensor_to_bytes(images)
  File "C:\_ComfyUi\ComfyUI\custom_nodes\ComfyUI-VideoHelperSuite\videohelpersuite\nodes.py", line 93, in tensor_to_bytes
    return tensor_to_int(tensor, 8).astype(np.uint8)
  File "C:\_ComfyUi\ComfyUI\custom_nodes\ComfyUI-VideoHelperSuite\videohelpersuite\nodes.py", line 89, in tensor_to_int
    return np.clip(tensor, 0, (2**bits-1))
  File "<__array_function__ internals>", line 180, in clip
  File "C:\Users\M\AppData\Local\Programs\Python\Python310\lib\site-packages\numpy\core\fromnumeric.py", line 2154, in clip
    return _wrapfunc(a, 'clip', a_min, a_max, out=out, **kwargs)
  File "C:\Users\M\AppData\Local\Programs\Python\Python310\lib\site-packages\numpy\core\fromnumeric.py", line 57, in _wrapfunc
    return bound(*args, **kwds)
  File "C:\Users\M\AppData\Local\Programs\Python\Python310\lib\site-packages\numpy\core\_methods.py", line 160, in _clip
    return _clip_dep_invoke_with_casting(
  File "C:\Users\M\AppData\Local\Programs\Python\Python310\lib\site-packages\numpy\core\_methods.py", line 114, in _clip_dep_invoke_with_casting
    return ufunc(*args, out=out, **kwargs)
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 102. MiB for an array with shape (24, 448, 832, 3) and data type float32
Prompt executed in 38.60 seconds
got prompt
ERROR:root:Failed to validate prompt for output 44:
ERROR:root:* (prompt):
ERROR:root:  - Required input is missing: images
ERROR:root:* VHS_VideoCombine 44:
ERROR:root:  - Required input is missing: images
ERROR:root:Output will be ignored
[rgthree] Using rgthree's optimized recursive execution.
Global Step: 20260
[] []
[AnimateDiffEvo] - INFO - Sliding context window activated - latents passed in (24) greater than context_length 16.
[AnimateDiffEvo] - INFO - Using motion module mm_sd_v15_v2.ckpt:v2.
Requested to load ControlNet
Loading 1 new model
100%|██████████████████████████████████████████████████████████████████████████████████| 22/22 [00:34<00:00, 1.58s/it]
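Side note on that _ArrayMemoryError: the failed allocation is exactly the float32 buffer for 24 frames at 448x832x3, so the amount itself is tiny; the point is that system RAM was already exhausted when it was requested. The arithmetic:

```python
# Size of the numpy array the error failed to allocate: shape (24, 448, 832, 3), float32.
size_bytes = 24 * 448 * 832 * 3 * 4      # 4 bytes per float32 element
print(f"{size_bytes / 2**20:.1f} MiB")   # ~102.4 MiB, matching the error message
```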
C:\_ComfyUi\ComfyUI>
C:\_ComfyUi\ComfyUI>
After restarting ComfyUI:
To see the GUI go to: http://127.0.0.1:8188
FETCH DATA from: C:\_ComfyUi\ComfyUI\custom_nodes\ComfyUI-Manager\extension-node-map.json
[comfy_mtb] | WARNING -> Found multiple match, we will pick the first C:\_ComfyUi\ComfyUI\models\upscale_models ['C:\_ComfyUi\ComfyUI\models\upscale_models', 'C:/SD/stable-diffusion-webui/models/ESRGAN', 'C:/SD/stable-diffusion-webui/models/RealESRGAN', 'C:/SD/stable-diffusion-webui/models/SwinIR']
got prompt
ERROR:root:Failed to validate prompt for output 44:
ERROR:root:* (prompt):
ERROR:root:  - Required input is missing: images
ERROR:root:* VHS_VideoCombine 44:
ERROR:root:  - Required input is missing: images
ERROR:root:Output will be ignored
ERROR:root:Failed to validate prompt for output 40:
ERROR:root:* (prompt):
ERROR:root:  - Required input is missing: images
ERROR:root:* VHS_VideoCombine 40:
ERROR:root:  - Required input is missing: images
ERROR:root:Output will be ignored
[rgthree] Using rgthree's optimized recursive execution.
[rgthree] First run patching recursive_output_delete_if_changed and recursive_will_execute.
[rgthree] Note: If execution seems broken due to forward ComfyUI changes, you can disable the optimization from rgthree settings in ComfyUI.
Requested to load CLIPVisionModelProjection
Loading 1 new model
model_type EPS
adm 0
Using xformers attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using xformers attention in VAE
clip missing: ['clip_l.text_projection', 'clip_l.logit_scale']
clip unexpected: ['clip_l.transformer.text_model.embeddings.position_ids']
left over keys: dict_keys(['model_ema.decay', 'model_ema.num_updates'])
[AnimateDiffEvo] - INFO - Loading motion module mm_sd_v15_v2.ckpt
[AnimateDiffEvo] - INFO - Loading motion LoRA v2_lora_ZoomOut.ckpt
Global Step: 10000
[AnimateDiffEvo] - INFO - Applying a v2 LoRA (v2_lora_ZoomOut.ckpt) to a v2 motion model.
Requested to load SD1ClipModel
Loading 1 new model
[] []
Requested to load AutoencoderKL
Loading 1 new model
[AnimateDiffEvo] - INFO - Sliding context window activated - latents passed in (24) greater than context_length 16.
[AnimateDiffEvo] - INFO - Using motion module mm_sd_v15_v2.ckpt:v2.
Requested to load BaseModel
Requested to load ControlNet
Requested to load AnimateDiffModel
Loading 3 new models
100%|██████████████████████████████████████████████████████████████████████████████████| 15/15 [00:24<00:00, 1.62s/it]
Using xformers attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using xformers attention in VAE
Leftover VAE keys ['model_ema.decay', 'model_ema.num_updates']
Requested to load AutoencoderKL
Loading 1 new model
C:\_ComfyUi\ComfyUI>
C:\_ComfyUi\ComfyUI>
Two things. First, you have a million things running in the background; kill them all, no matter how much RAM you have or how deep your pockets are. Second, use a virtual environment for each UI; Python likes everything neat and tidy. Also realize this isn't standard Windows software: installing every node and repo is a bad idea. Just git-clone the good ones and keep them updated, or you end up where you are now. Save your models and workflows and reinstall, man, and don't blame ComfyUI; it's a damn great piece of software.
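On the virtual-environment point, the standard library can create one per UI so their packages don't fight. A minimal sketch using only the stdlib; the path is just an example:

```python
# Create an isolated Python environment for ComfyUI using the standard library.
# Activate it afterwards (on Windows: <env>\Scripts\activate) and install requirements there.
import venv

venv.create(r"C:\envs\comfyui-venv", with_pip=True)  # example path, pick your own
```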
@carolekon I am not "blaming" the software... I am just telling you what happened and how. I have also already said that it was a pretty odd and unclear situation. If I wrote the title the way I did, it's because I wanted people to read it, so I could get help and so other people would be informed about the issue.
I observed that if I switched off LM Studio, ComfyUI quickly regained speed and continued. As a stopgap solution, I now plan to do things in batches (finish all LM Studio work, save the results as metadata, and stop LM Studio) and then do all the ComfyUI tasks. But I will definitely try the System Memory Fallback for Stable Diffusion solution: https://nvidia.custhelp.com/app/answers/detail/a_id/5490/~/system-memory-fallback-for-stable-diffusion
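To see which processes (LM Studio, ComfyUI, and so on) are actually holding VRAM at a given moment, NVML can list the compute processes per GPU. A sketch with nvidia-ml-py; on Windows the per-process figure is sometimes not reported, so that case is handled:

```python
# List compute processes currently holding memory on GPU 0. Requires nvidia-ml-py (pynvml).
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)

for proc in pynvml.nvmlDeviceGetComputeRunningProcesses(handle):
    used = proc.usedGpuMemory  # may be None on Windows/WDDM
    used_str = f"{used / 1024**2:.0f} MiB" if used else "n/a (not reported on this platform)"
    print(f"pid {proc.pid}: {used_str}")

pynvml.nvmlShutdown()
```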
I only recently found out about this problem, and I tried to verify it as much as I could before opening another issue. OLD POST: https://github.com/comfyanonymous/ComfyUI/issues/2873
During a regular rendering session, for no particular reason, ComfyUI drained my RAM and used 99% of it, causing my PC to start freezing for 10-15 seconds, just long enough for me to realize what was happening and close the CMD window. When I restarted ComfyUI I had a problem that looked like a memory leak: ComfyUI kept consuming more and more RAM without giving it back after each render or other process. Even if I changed workflow or refreshed the interface, the RAM wasn't freed. The only way out is to close the CMD window and restart ComfyUI.
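To pin the leak down objectively (RAM growing after every render and never being released), it can help to log the ComfyUI process's resident memory between prompts. A rough sketch with psutil; the PID is a placeholder you would look up in Task Manager:

```python
# Log the resident RAM of the ComfyUI python process once a minute. Requires psutil.
import time
import psutil

PID = 12345  # placeholder -- replace with the actual python.exe PID running ComfyUI
proc = psutil.Process(PID)

while True:
    rss_gib = proc.memory_info().rss / 1024**3
    print(f"{time.strftime('%H:%M:%S')}  RSS = {rss_gib:.2f} GiB")
    time.sleep(60)
```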
Now the BIG problem is that I am having a similar issue with a few other apps on my PC, like DaVinci Resolve or even Automatic1111 (in the case of A1111 I was getting a CUDA error when simply trying to load another model). What looked like a memory leak related only to ComfyUI is now happening in other apps too. Apparently it does not happen with some other apps, like video games or Photoshop for example, and other than that the PC's behavior looks normal.
So far it has been pretty hard to understand the nature of this problem, but to my understanding it seems related to RAM or VRAM. I ran the Windows 10 built-in RAM test and no errors came up, and I also ran some GPU benchmarks, which also came out fine, so I don't think it is a hardware problem.
This whole thing is pretty confusing to me too; all I know is that it started after the episode I described above. So please forgive me if I'm not explaining something correctly, since I'm talking about things I don't know well. I understand that this can seem pretty weird.
I don't want to alarm anybody with this post; I am just trying to explain in detail what happened.
I hope this gets to the devs, and if anyone has an idea it would be much appreciated! Thanks for taking the time.
My specs:
CPU: AMD Ryzen 9 5900X
GPU: NVIDIA RTX 4090
RAM: 32 GB Kingston (2x16 GB)
OS: Windows 10
These screenshots were taken after rendering an animation with ComfyUI, when I first started getting this issue.