MinusZoneAI / ComfyUI-CogVideoX-MZ

CogVideoX-5B 4-bit quantization model
GNU General Public License v3.0

Given groups=1, weight of size [3072, 16, 2, 2], expected input[26, 32, 60, 90] to have 16 channels, but got 32 channels instead #21

Open jackkite opened 3 days ago

jackkite commented 3 days ago

got prompt
Encoded latents shape: torch.Size([1, 1, 16, 60, 90])
Temporal tiling disabled
  0%|          | 0/50 [00:00<?, ?it/s]
!!! Exception during processing !!! Given groups=1, weight of size [3072, 16, 2, 2], expected input[26, 32, 60, 90] to have 16 channels, but got 32 channels instead
Traceback (most recent call last):
  File "E:\ComfyUI_windows_portable\ComfyUI\execution.py", line 323, in execute
    output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "E:\ComfyUI_windows_portable\ComfyUI\execution.py", line 198, in get_output_data
    return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "E:\ComfyUI_windows_portable\ComfyUI\execution.py", line 169, in _map_node_over_list
    process_inputs(input_dict, i)
  File "E:\ComfyUI_windows_portable\ComfyUI\execution.py", line 158, in process_inputs
    results.append(getattr(obj, func)(**inputs))
  File "E:\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-CogVideoXWrapper\nodes.py", line 843, in process
    latents = pipeline["pipe"](
  File "E:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "E:\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-CogVideoX-MZ-main\pipeline_cogvideox.py", line 615, in __call__
    noise_pred = self.transformer(
  File "E:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "E:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "E:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\diffusers\models\transformers\cogvideox_transformer_3d.py", line 446, in forward
    hidden_states = self.patch_embed(encoder_hidden_states, hidden_states)
  File "E:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "E:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "E:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\diffusers\models\embeddings.py", line 412, in forward
    image_embeds = self.proj(image_embeds)
  File "E:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "E:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "E:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\conv.py", line 460, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "E:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\conv.py", line 456, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Given groups=1, weight of size [3072, 16, 2, 2], expected input[26, 32, 60, 90] to have 16 channels, but got 32 channels instead

Prompt executed in 1.88 seconds

(screenshot of the workflow attached: 捕获.PNG)
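Editorial note: the failure is raised by the 2×2 `Conv2d` projection inside diffusers' `CogVideoXPatchEmbed` (`embeddings.py`, `self.proj`). Its weight shape `[3072, 16, 2, 2]` means it accepts 16-channel latents, while the latents reaching it have 32 channels; since 32 = 2 × 16, the channel dimension appears to have been doubled somewhere upstream of `patch_embed`. A minimal, self-contained sketch that reproduces the same RuntimeError using only the tensor shapes from the log above (not the actual pipeline):

```python
import torch
import torch.nn as nn

# Patch-embedding projection as described by the error: weight [3072, 16, 2, 2],
# i.e. 16 input channels, 3072 output channels, 2x2 kernel (stride 2).
proj = nn.Conv2d(in_channels=16, out_channels=3072, kernel_size=2, stride=2)

# Latents with the shape reported in the traceback: 26 x 32 x 60 x 90.
latents = torch.randn(26, 32, 60, 90)

try:
    proj(latents)
except RuntimeError as e:
    # Prints the same message: "Given groups=1, weight of size [3072, 16, 2, 2],
    # expected input[26, 32, 60, 90] to have 16 channels, but got 32 channels instead"
    print(e)
```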

wailovet commented 2 days ago

Try turning off vae_encode_tiling.
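If turning off vae_encode_tiling does not change anything, it may help to confirm where the extra 16 channels come from before the sampling loop runs. A small, hypothetical check, assuming a diffusers-style `transformer` object and the `[B, F, C, H, W]` latent layout shown in the log above (this is not this repo's actual API, just a debugging sketch):

```python
import torch

def assert_latent_channels(latents: torch.Tensor, transformer) -> None:
    """Hypothetical sanity check: the CogVideoX-5B text-to-video transformer
    expects 16-channel latents; fail early with a clearer message if not."""
    expected = transformer.config.in_channels   # 16 for the T2V checkpoint
    got = latents.shape[2]                       # [B, F, C, H, W], as in the log
    if got != expected:
        raise ValueError(
            f"Latents have {got} channels but the transformer expects {expected}. "
            "Check the VAE encode path (vae_encode_tiling) and whether extra "
            "conditioning latents were concatenated along the channel dim."
        )
```

If such a check fired with 32 channels, that would point at the encode/concat step upstream of the transformer rather than at the quantized weights themselves.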

jackkite commented 2 days ago

> Try turning off vae_encode_tiling.

Still the same error. (screenshot: 捕获.PNG, upload incomplete)