kijai / ComfyUI-CogVideoXWrapper

Sizes of tensors must match except in dimension 2. Expected size 60 but got size 12 for tensor number 1 in the list. #34

Open phr00t opened 2 weeks ago

phr00t commented 2 weeks ago
!!! Exception during processing !!! Sizes of tensors must match except in dimension 2. Expected size 60 but got size 12 for tensor number 1 in the list.
Traceback (most recent call last):
  File "D:\ComfyUI_windows_portable\ComfyUI\execution.py", line 317, in execute
    output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\ComfyUI_windows_portable\ComfyUI\execution.py", line 192, in get_output_data
    return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\ComfyUI_windows_portable\ComfyUI\execution.py", line 169, in _map_node_over_list
    process_inputs(input_dict, i)
  File "D:\ComfyUI_windows_portable\ComfyUI\execution.py", line 158, in process_inputs
    results.append(getattr(obj, func)(**inputs))
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-CogVideoXWrapper\nodes.py", line 364, in decode
    frames = vae.decode(latents).sample
             ^^^^^^^^^^^^^^^^^^^
  File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\diffusers\utils\accelerate_utils.py", line 46, in wrapper
    return method(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\diffusers\models\autoencoders\autoencoder_kl_cogvideox.py", line 1153, in decode
    decoded = self._decode(z).sample
              ^^^^^^^^^^^^^^^
  File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\diffusers\models\autoencoders\autoencoder_kl_cogvideox.py", line 1112, in _decode
    return self.tiled_decode(z, return_dict=return_dict)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\diffusers\models\autoencoders\autoencoder_kl_cogvideox.py", line 1229, in tiled_decode
    tile = self.decoder(tile)
           ^^^^^^^^^^^^^^^^^^
  File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\diffusers\models\autoencoders\autoencoder_kl_cogvideox.py", line 851, in forward
    hidden_states = self.conv_in(sample)
                    ^^^^^^^^^^^^^^^^^^^^
  File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\diffusers\models\autoencoders\autoencoder_kl_cogvideox.py", line 134, in forward
    inputs = self.fake_context_parallel_forward(inputs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\diffusers\models\autoencoders\autoencoder_kl_cogvideox.py", line 126, in fake_context_parallel_forward
    inputs = torch.cat(cached_inputs + [inputs], dim=2)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Sizes of tensors must match except in dimension 2. Expected size 60 but got size 12 for tensor number 1 in the list.

Prompt executed in 40.46 seconds

This error appears to be related to using the vae_tiling feature.
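
To make the message easier to read: the traceback ends in `torch.cat(cached_inputs + [inputs], dim=2)`, and the error text says every dimension except dimension 2 has to agree across the tensors being joined. The sketch below restates that rule in plain Python on shape tuples. It is not PyTorch's implementation; `find_cat_mismatch` is a hypothetical helper, and the two shapes are made-up values chosen only so that a 60 meets a 12, since the actual tensor shapes are not shown in the log.

```python
from typing import Optional, Sequence, Tuple


def find_cat_mismatch(
    shapes: Sequence[Tuple[int, ...]], dim: int
) -> Optional[Tuple[int, int, int, int]]:
    """Return (tensor_index, axis, expected, got) for the first shape that
    cannot be concatenated with shapes[0] along `dim`, or None if all fit."""
    reference = shapes[0]
    for index, shape in enumerate(shapes[1:], start=1):
        if len(shape) != len(reference):
            raise ValueError("all shapes must have the same number of dimensions")
        for axis, (expected, got) in enumerate(zip(reference, shape)):
            # Only the concatenation axis is allowed to differ.
            if axis != dim and expected != got:
                return index, axis, expected, got
    return None


# Hypothetical (batch, channels, frames, height, width) shapes, for illustration only.
cached = (1, 16, 2, 60, 90)  # assumed: a cached input at full latent size
tile = (1, 16, 3, 12, 12)    # assumed: a much smaller tile

print(find_cat_mismatch([cached, tile], dim=2))  # (1, 3, 60, 12)
```

Read this way, "Expected size 60 but got size 12 for tensor number 1" means the second tensor in the list (the incoming `inputs`) is 12 along some non-frame axis where the first (from `cached_inputs`) is 60. One possible reading, not confirmed anywhere in this thread, is that the cache still holds a tensor from a decode with a different spatial size than the current tile.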

Bellzs commented 2 weeks ago

I'm hitting the same problem. Did you resolve it?

Gerkinfeltser commented 2 weeks ago

Yup, same issue here, and I'm also using enable_vae_tiling. With the new ComfyUI error popup, the following message is displayed for CogVideoDecode: Sizes of tensors must match except in dimension 2. Expected size 60 but got size 12 for tensor number 1 in the list.