DualCLIPLoader (GGUF) ---> torch.OutOfMemoryError: Allocation on device

gkiryaziev commented 3 months ago

DualCLIPLoader (GGUF) fails with an out-of-memory error, but the default DualCLIPLoader works fine.


WAS Node Suite: TextBatch Index: 1
Requested to load FluxClipModel_
Loading 1 new model
loaded completely 0.0 4778.66552734375 True
!!! Exception during processing !!! Allocation on device
Traceback (most recent call last):
  File "C:\AI\ComfyUI2\execution.py", line 317, in execute
    output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\ComfyUI2\execution.py", line 192, in get_output_data
    return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\ComfyUI2\execution.py", line 169, in _map_node_over_list
    process_inputs(input_dict, i)
  File "C:\AI\ComfyUI2\execution.py", line 158, in process_inputs
    results.append(getattr(obj, func)(**inputs))
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\ComfyUI2\comfy_extras\nodes_flux.py", line 21, in encode
    output = clip.encode_from_tokens(tokens, return_pooled=True, return_dict=True)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\ComfyUI2\comfy\sd.py", line 126, in encode_from_tokens
    o = self.cond_stage_model.encode_token_weights(tokens)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\ComfyUI2\comfy\text_encoders\flux.py", line 57, in encode_token_weights
    t5_out, t5_pooled = self.t5xxl.encode_token_weights(token_weight_pairs_t5)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\ComfyUI2\comfy\sd1_clip.py", line 41, in encode_token_weights
    o = self.encode(to_encode)
        ^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\ComfyUI2\comfy\sd1_clip.py", line 229, in encode
    return self(tokens)
           ^^^^^^^^^^^^
  File "C:\AI\venv\Lib\site-packages\torch\nn\modules\module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\venv\Lib\site-packages\torch\nn\modules\module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\ComfyUI2\comfy\sd1_clip.py", line 201, in forward
    outputs = self.transformer(tokens, attention_mask_model, intermediate_output=self.layer_idx, final_layer_norm_intermediate=self.layer_norm_hidden_state, dtype=torch.float32)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\venv\Lib\site-packages\torch\nn\modules\module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\venv\Lib\site-packages\torch\nn\modules\module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\ComfyUI2\comfy\text_encoders\t5.py", line 238, in forward
    x = self.shared(input_ids, out_dtype=kwargs.get("dtype", torch.float32))
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\venv\Lib\site-packages\torch\nn\modules\module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\venv\Lib\site-packages\torch\nn\modules\module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\ComfyUI2\comfy\ops.py", line 202, in forward
    return self.forward_comfy_cast_weights(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\ComfyUI2\comfy\ops.py", line 197, in forward_comfy_cast_weights
    weight, bias = cast_bias_weight(self, device=input.device, dtype=out_dtype)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\ComfyUI2\comfy\ops.py", line 46, in cast_bias_weight
    weight = cast_to(s.weight, dtype, device, non_blocking=non_blocking)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\ComfyUI2\comfy\ops.py", line 26, in cast_to
    r = torch.empty_like(weight, dtype=dtype, device=device)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
torch.OutOfMemoryError: Allocation on device
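
For scale, the last frames of the traceback show cast_to calling torch.empty_like on the T5 shared embedding weight with dtype=torch.float32, which asks the device for a full-size buffer of that weight. The sketch below is a rough estimate of that one buffer. It assumes the standard T5-XXL dimensions (vocabulary 32128, hidden size 4096), which the thread does not state, and the helper names are hypothetical.

```python
# Rough size of the buffer requested by torch.empty_like(weight, dtype=...)
# for the T5 shared embedding. The dimensions are ASSUMED (standard T5-XXL
# config); they are not stated anywhere in the thread.
VOCAB_SIZE = 32128   # assumption: T5-XXL vocabulary size
D_MODEL = 4096       # assumption: T5-XXL hidden size

BYTES_PER_ELEMENT = {"float16": 2, "bfloat16": 2, "float32": 4}


def cast_buffer_bytes(rows: int, cols: int, dtype: str) -> int:
    """Bytes needed for a dense rows x cols tensor of the given dtype."""
    return rows * cols * BYTES_PER_ELEMENT[dtype]


def to_mib(num_bytes: int) -> float:
    """Convert bytes to mebibytes."""
    return num_bytes / 2**20


if __name__ == "__main__":
    for dtype in ("float16", "float32"):
        size = cast_buffer_bytes(VOCAB_SIZE, D_MODEL, dtype)
        print(f"{dtype}: {to_mib(size):.1f} MiB")
```

Under these assumptions the float32 buffer is about 502 MiB, requested on top of the roughly 4.7 GB the log reports as already loaded, so a card with little free VRAM can fail at exactly this line.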

city96 commented 3 months ago

Looks like recent ComfyUI changes increased RAM usage by a lot. Most likely related to https://github.com/city96/ComfyUI-GGUF/issues/57 as well.

al-swaiti commented 2 months ago

Try launching ComfyUI with python main.py --lowvram
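
To judge whether a low-VRAM mode is likely to be needed, it can help to compare free device memory against the size of the allocation that failed. This is a minimal sketch, not part of ComfyUI or ComfyUI-GGUF: the function names and the 10% headroom are illustrative assumptions, and the only PyTorch calls used are torch.cuda.is_available and torch.cuda.mem_get_info.

```python
from typing import Optional


def query_free_vram() -> Optional[int]:
    """Return free bytes on the current CUDA device, or None if unavailable."""
    try:
        import torch  # imported lazily so the helper also works without PyTorch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    free_bytes, _total_bytes = torch.cuda.mem_get_info()
    return free_bytes


def fits_on_device(required_bytes: int, free_bytes: int, headroom: float = 0.1) -> bool:
    """True if the request fits while leaving a fraction of free memory unused.

    The 10% default headroom is an arbitrary illustrative margin.
    """
    return required_bytes <= free_bytes * (1.0 - headroom)


if __name__ == "__main__":
    required = 512 * 2**20  # hypothetical request: a 512 MiB cast buffer
    free = query_free_vram()
    if free is None:
        print("No CUDA device visible; cannot check free VRAM.")
    elif fits_on_device(required, free):
        print("The allocation should fit with the current free VRAM.")
    else:
        print("Not enough free VRAM; a low-VRAM mode may help.")
```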

city96 commented 2 months ago

Forgot about this, but it was possibly fixed by https://github.com/city96/ComfyUI-GGUF/commit/454955ead3336322215a206edbd7191eb130bba0 ...?
