cubiq / ComfyUI_IPAdapter_plus

GNU General Public License v3.0

OOM on long animation sequence batch #692

Open memo opened 2 weeks ago

memo commented 2 weeks ago

When I try to export a long animation with animatediff (say 300 frames) I'm running OOM on VRAM due to IPAdapters I'm using. Animatediff uses overlapping context windows of only 16 frames long. Is there a way for IPAdapterPlus to not process the entire 300 frames at once but also use rolling context windows?

FWIW I'm using 4x IPAdapter Batch (Adv) similar to this workflow https://www.youtube.com/watch?v=jc65n-viEEU

2x of them are for one region of an image (with attn mask) fading between different ref images, and the other 2x are for another region of the image (with attn mask) fading between different ref images. I actually wanted to add more (i.e. for different image regions), but I'm very quickly running out of VRAM due to animation length!
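The rolling-window behavior memo asks about can be sketched as plain index generation. This is a conceptual sketch only, not AnimateDiff's or IPAdapter's actual implementation; the function name and parameters are hypothetical:

```python
def context_windows(total_frames, window=16, overlap=4):
    """Yield overlapping [start, end) index windows over a frame sequence,
    similar in spirit to AnimateDiff's sliding context windows."""
    stride = window - overlap
    start = 0
    while start < total_frames:
        end = min(start + window, total_frames)
        yield (start, end)
        if end == total_frames:
            break
        start += stride
```

Processing frames window by window like this bounds peak memory by the window size rather than the full animation length.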

cubiq commented 2 weeks ago

IPAdapter is already offloading what is not needed; the problem shouldn't be so much the number of frames as the number of IPAdapters. If you get OOM on the IPAdapter node, you can try the batch option, which will encode the frames in chunks instead of all at the same time.
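The chunked encoding cubiq describes can be sketched as follows. This is a minimal illustration of the idea, not the actual IPAdapter code; `encode_fn` and `encode_in_chunks` are hypothetical names:

```python
def encode_in_chunks(encode_fn, frames, chunk_size=200):
    """Encode a large batch of frames in fixed-size chunks so that peak
    memory is bounded by chunk_size rather than the total frame count.

    encode_fn: callable mapping a list of frames to a list of embeddings.
    frames:    the full sequence of frames.
    """
    outputs = []
    for start in range(0, len(frames), chunk_size):
        # Encode one chunk at a time; only this slice needs to be
        # resident on the GPU during the encode call.
        outputs.extend(encode_fn(frames[start:start + chunk_size]))
    return outputs
```

The result is identical to encoding everything at once; only the peak allocation changes.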

memo commented 2 weeks ago

Thanks for the quick reply Matteo, and for the wonderful extension and very informative videos!

I'm sorry, but I don't fully understand what you mean by this: "you can try the batch option, which will encode the frames in chunks instead of all at the same time"

FYI: when I run my workflow (1280x800, 15fps) on my 24GB 4090, I can run it to the end at 300 frames with no problem. I start getting OOM in the IPAdapters at 800 frames. At 400-800 frames the IPAdapters run fine, but they use so much VRAM that I get OOM in the KSampler.

Below is the full console dump of the OOM, and VRAM usage for different frame counts (in the top left of each screenshot). The VRAM figures downstream of the node that OOMs should be ignored, as they are from the previous run. Before each run, I restarted ComfyUI and confirmed VRAM usage was at 0 in nvidia-smi.

I also made a table, because why not :) https://docs.google.com/spreadsheets/d/1G-uepKBogC0nJuAd5u2RrMzA-yT7H97rDlOXgvfxddQ/edit?usp=sharing

[screenshots: ComfyUI VRAM usage at 100F, 200F, 400F, 600F, 800F, and 1000F]
!!! Exception during processing !!! Allocation on device
Traceback (most recent call last):
  File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\ComfyUI\execution.py", line 317, in execute
    output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\ComfyUI\execution.py", line 192, in get_output_data
    return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\ComfyUI\execution.py", line 169, in _map_node_over_list
    process_inputs(input_dict, i)
  File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\ComfyUI\execution.py", line 158, in process_inputs
    results.append(getattr(obj, func)(**inputs))
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\ComfyUI\custom_nodes\ComfyUI_IPAdapter_plus\IPAdapterPlus.py", line 822, in apply_ipadapter
    work_model, face_image = ipadapter_execute(work_model, ipadapter_model, clip_vision, **ipa_args)
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\ComfyUI\custom_nodes\ComfyUI_IPAdapter_plus\IPAdapterPlus.py", line 359, in ipadapter_execute
    img_cond_embeds = encode_image_masked(clipvision, image, batch_size=encode_batch_size, tiles=enhance_tiles, ratio=enhance_ratio, clipvision_size=clipvision_size)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\ComfyUI\custom_nodes\ComfyUI_IPAdapter_plus\utils.py", line 242, in encode_image_masked
    embeds = encode_image_masked_(clip_vision, image, mask, batch_size, clipvision_size=clipvision_size)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\ComfyUI\custom_nodes\ComfyUI_IPAdapter_plus\utils.py", line 299, in encode_image_masked_
    out = clip_vision.model(pixel_values=pixel_values, intermediate_output=-2)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\ComfyUI\comfy\clip_model.py", line 194, in forward
    x = self.vision_model(*args, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\ComfyUI\comfy\clip_model.py", line 183, in forward
    x, i = self.encoder(x, mask=None, intermediate_output=intermediate_output)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\ComfyUI\comfy\clip_model.py", line 69, in forward
    x = l(x, mask, optimized_attention)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\ComfyUI\comfy\clip_model.py", line 50, in forward
    x += self.self_attn(self.layer_norm1(x), mask, optimized_attention)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\ComfyUI\comfy\clip_model.py", line 19, in forward
    v = self.v_proj(x)
        ^^^^^^^^^^^^^^
  File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\ComfyUI\comfy\ops.py", line 76, in forward
    return self.forward_comfy_cast_weights(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Windows\Documents\MSA_local\ComfyUI_local\ComfyUI\comfy\ops.py", line 72, in forward_comfy_cast_weights
    return torch.nn.functional.linear(input, weight, bias)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
torch.cuda.OutOfMemoryError: Allocation on device

Got an OOM, unloading all loaded models.

memo commented 1 week ago

Hi @cubiq, I was wondering if you had any further thoughts on the memory increase with frame count, and if you could expand on what you meant by the "batch option". Thanks!

cubiq commented 1 week ago

I can try to force some cache cleaning. I actually disabled that a while back because sampling turned out slower, so it's kind of a trade-off. Maybe it could be activated automatically only over a certain number of frames, or I could offer an option to enable it.
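The threshold-gated cache cleaning cubiq suggests could be sketched like this. This is a hypothetical helper, not code from the extension; the function name and the 600-frame threshold are assumptions for illustration:

```python
import gc


def maybe_free_cache(num_frames, threshold=600):
    """Free cached allocations only above a frame-count threshold,
    since cache clearing slows sampling and is a trade-off.

    Returns True if a cleanup was performed."""
    if num_frames <= threshold:
        return False
    gc.collect()
    try:
        import torch
        if torch.cuda.is_available():
            # Release unused blocks held by PyTorch's caching allocator
            # back to the driver.
            torch.cuda.empty_cache()
    except ImportError:
        pass
    return True
```

Note that `torch.cuda.empty_cache()` only returns *unused* cached blocks; it does not free tensors that are still referenced, so it helps with fragmentation-style OOMs rather than true over-allocation.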

I made some tests in the past and I can do 600 frames with 4 IPA no problem, I don't think I ever tested more than that.

Do you have a simple workflow without extra nodes except IPA and AD that triggers the OOM?

cubiq commented 1 week ago

Have you tried this option? Set it to something like 200. [screenshot]

memo commented 1 week ago

Oh, I didn't see this! Thanks, yes, this actually solves it! (I'm now running into OOM in the KSampler, but I should be able to work around that, thanks!)