Issue opened 2 weeks ago
ipadapter is already offloading what is not needed, so the problem shouldn't be the number of frames so much as the number of ipadapters. If you get OOM on the ipadapter node you can try the batch option, which will encode the frames in chunks instead of all at the same time
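Roughly, the batch option chunks the encode like this (an illustrative sketch with a toy encoder and plain Python lists, not the actual `encode_image_masked` implementation): peak memory then scales with the chunk size instead of the total frame count.

```python
def encode_in_chunks(frames, encode_fn, batch_size):
    """Encode frames in fixed-size chunks so peak memory scales with
    batch_size instead of the total frame count (illustrative)."""
    embeds = []
    for start in range(0, len(frames), batch_size):
        chunk = frames[start:start + batch_size]
        embeds.extend(encode_fn(chunk))  # append this chunk's embeddings
    return embeds

# toy "encoder": one number per frame (mean pixel value)
frames = [[0.0, 1.0]] * 10  # 10 tiny fake "frames"
toy_encode = lambda chunk: [sum(f) / len(f) for f in chunk]
embeds = encode_in_chunks(frames, toy_encode, batch_size=3)
print(len(embeds))  # 10
```

The trade-off is a bit of overhead per chunk, but the allocator never has to hold all frames' activations at once.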
Thanks for the quick reply Matteo, and for the wonderful extension and very informative videos!
I'm sorry but I don't fully understand what you mean by this:
you can try the batch option, which will encode the frames in chunks instead of all at the same time
FYI: when I run my workflow (1280x800, 15fps) on my 24GB 4090, I can run it to the end at 300 frames with no problem. I start getting OOM in the IPAdapters at 800 frames. At 400-800 frames the IPAdapters run fine, but they use so much VRAM that I get OOM in the KSampler.
Below is the full console dump of the OOM, plus VRAM usage for different frame counts (in the top left of each screenshot). The VRAM figures downstream of the node that OOMs should be ignored, as they are from the previous run. Before each run, I restarted ComfyUI and confirmed VRAM usage was at 0 in nvidia-smi.
I also made a table, because why not :) https://docs.google.com/spreadsheets/d/1G-uepKBogC0nJuAd5u2RrMzA-yT7H97rDlOXgvfxddQ/edit?usp=sharing
!!! Exception during processing !!! Allocation on device
Traceback (most recent call last):
File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\ComfyUI\execution.py", line 317, in execute
output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\ComfyUI\execution.py", line 192, in get_output_data
return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\ComfyUI\execution.py", line 169, in _map_node_over_list
process_inputs(input_dict, i)
File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\ComfyUI\execution.py", line 158, in process_inputs
results.append(getattr(obj, func)(**inputs))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\ComfyUI\custom_nodes\ComfyUI_IPAdapter_plus\IPAdapterPlus.py", line 822, in apply_ipadapter
work_model, face_image = ipadapter_execute(work_model, ipadapter_model, clip_vision, **ipa_args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\ComfyUI\custom_nodes\ComfyUI_IPAdapter_plus\IPAdapterPlus.py", line 359, in ipadapter_execute
img_cond_embeds = encode_image_masked(clipvision, image, batch_size=encode_batch_size, tiles=enhance_tiles, ratio=enhance_ratio, clipvision_size=clipvision_size)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\ComfyUI\custom_nodes\ComfyUI_IPAdapter_plus\utils.py", line 242, in encode_image_masked
embeds = encode_image_masked_(clip_vision, image, mask, batch_size, clipvision_size=clipvision_size)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\ComfyUI\custom_nodes\ComfyUI_IPAdapter_plus\utils.py", line 299, in encode_image_masked_
out = clip_vision.model(pixel_values=pixel_values, intermediate_output=-2)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\ComfyUI\comfy\clip_model.py", line 194, in forward
x = self.vision_model(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\ComfyUI\comfy\clip_model.py", line 183, in forward
x, i = self.encoder(x, mask=None, intermediate_output=intermediate_output)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\ComfyUI\comfy\clip_model.py", line 69, in forward
x = l(x, mask, optimized_attention)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\ComfyUI\comfy\clip_model.py", line 50, in forward
x += self.self_attn(self.layer_norm1(x), mask, optimized_attention)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\ComfyUI\comfy\clip_model.py", line 19, in forward
v = self.v_proj(x)
^^^^^^^^^^^^^^
File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\ComfyUI\comfy\ops.py", line 76, in forward
return self.forward_comfy_cast_weights(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\ComfyUI\comfy\ops.py", line 72, in forward_comfy_cast_weights
return torch.nn.functional.linear(input, weight, bias)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
torch.cuda.OutOfMemoryError: Allocation on device
Got an OOM, unloading all loaded models.
Hi @cubiq , I was wondering if you had any further thoughts on the memory increase with frame count, and if you could expand on what you meant by the 'batch option'. Thanks!
I can try to force some cache cleaning. I actually disabled that a while back because sampling turned out slower, so it's kind of a trade-off. Maybe it could be activated automatically only over a certain number of frames, or I could offer the option to do it.
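An automatic threshold like that could look something like the sketch below. `maybe_empty_cache` and the `threshold` value are hypothetical, not an existing IPAdapter_plus function; the real knob in PyTorch is `torch.cuda.empty_cache()`.

```python
def maybe_empty_cache(frame_count, threshold=400):
    """Hypothetical helper: only flush the CUDA caching allocator on long
    runs, since emptying the cache slows sampling on short ones."""
    if frame_count < threshold:
        return False  # short run: keep the cache so sampling stays fast
    try:
        import torch
        if torch.cuda.is_available():
            torch.cuda.empty_cache()  # release cached blocks back to the driver
    except ImportError:
        pass  # no torch here; the sketch still shows the gating logic
    return True
```

Called once per encoded chunk, this would only pay the `empty_cache` cost on the long animations that actually need it.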
I made some tests in the past and I can do 600 frames with 4 IPA no problem, I don't think I ever tested more than that.
Do you have a simple workflow without extra nodes except IPA and AD that triggers the OOM?
Have you tried this option? Set it to something like 200.
Oh I didn't see this! Thanks, yes this actually solves it! (I'm now running into OOM on KSampler, but I should be able to work around that, thanks!)
When I try to export a long animation with AnimateDiff (say 300 frames), I run out of VRAM because of the IPAdapters I'm using. AnimateDiff uses overlapping context windows that are only 16 frames long. Is there a way for IPAdapterPlus to not process the entire 300 frames at once, and instead also use rolling context windows?
FWIW I'm using 4x IPAdapter Batch (Adv) nodes, similar to this workflow: https://www.youtube.com/watch?v=jc65n-viEEU. 2x of them are for one region of the image (with attn mask), fading between different ref images, and the other 2x are for another region of the image (with attn mask), also fading between different ref images. I actually wanted to add more (i.e. for different image regions), but I'm very quickly running out of VRAM due to animation length!
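For reference, the rolling-context behavior I mean is something like this sketch (illustrative only; the `length`/`overlap` parameter names are made up, and this is not how IPAdapterPlus currently batches frames):

```python
def context_windows(n_frames, length=16, overlap=4):
    """Overlapping frame windows, similar in spirit to AnimateDiff's
    sliding context (illustrative; parameter names are made up)."""
    stride = length - overlap
    windows, start = [], 0
    while start < n_frames:
        end = min(start + length, n_frames)
        windows.append(list(range(start, end)))  # frame indices for this window
        if end == n_frames:
            break
        start += stride
    return windows

wins = context_windows(40, length=16, overlap=4)
```

If the IPAdapter embeds were computed per window like this instead of for all 300 frames up front, peak VRAM would depend on the window length rather than the animation length.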