Moondream RuntimeError: expanded size of tensor (749) must match the existing size (750) at non-singleton dimension 1.

sjuxax commented 7 months ago

Tried the Moondream node and it says this:

ERROR:root:!!! Exception during processing !!!
ERROR:root:Traceback (most recent call last):
  File "/net/dj/code/clones/github.com/comfyanonymous/ComfyUI/execution.py", line 152, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/net/dj/code/clones/github.com/comfyanonymous/ComfyUI/execution.py", line 82, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/net/dj/code/clones/github.com/comfyanonymous/ComfyUI/custom_nodes/ComfyUI-0246/utils.py", line 381, in new_func
    res_value = old_func(*final_args, **kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/net/dj/code/clones/github.com/comfyanonymous/ComfyUI/execution.py", line 75, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/net/dj/code/clones/github.com/comfyanonymous/ComfyUI/custom_nodes/ComfyUI_VLM_nodes/nodes/moondream_script.py", line 76, in answer_questions
    full_sentence = self.text_model.answer_question(image_embeds, question)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/net/dj/code/clones/github.com/comfyanonymous/ComfyUI/custom_nodes/ComfyUI_VLM_nodes/nodes/moondream/text_model.py", line 79, in answer_question
    answer = self.generate(
             ^^^^^^^^^^^^^^
  File "/net/dj/code/clones/github.com/comfyanonymous/ComfyUI/custom_nodes/ComfyUI_VLM_nodes/nodes/moondream/text_model.py", line 71, in generate
    output_ids = self.model.generate(
                 ^^^^^^^^^^^^^^^^^^^^
  File "/home/jeff/.virtualenvs/comfyui/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/jeff/.virtualenvs/comfyui/lib/python3.11/site-packages/transformers/generation/utils.py", line 1544, in generate
    return self.greedy_search(
           ^^^^^^^^^^^^^^^^^^^
  File "/home/jeff/.virtualenvs/comfyui/lib/python3.11/site-packages/transformers/generation/utils.py", line 2404, in greedy_search
    outputs = self(
              ^^^^^
  File "/home/jeff/.virtualenvs/comfyui/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jeff/.virtualenvs/comfyui/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/net/dj/code/clones/github.com/comfyanonymous/ComfyUI/custom_nodes/ComfyUI_VLM_nodes/nodes/moondream/phi/modeling_phi.py", line 992, in forward
    hidden_states = self.transformer(
                    ^^^^^^^^^^^^^^^^^
  File "/home/jeff/.virtualenvs/comfyui/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jeff/.virtualenvs/comfyui/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/net/dj/code/clones/github.com/comfyanonymous/ComfyUI/custom_nodes/ComfyUI_VLM_nodes/nodes/moondream/phi/modeling_phi.py", line 933, in forward
    hidden_states = layer(
                    ^^^^^^
  File "/home/jeff/.virtualenvs/comfyui/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jeff/.virtualenvs/comfyui/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/net/dj/code/clones/github.com/comfyanonymous/ComfyUI/custom_nodes/ComfyUI_VLM_nodes/nodes/moondream/phi/modeling_phi.py", line 734, in forward
    attn_outputs = self.mixer(
                   ^^^^^^^^^^^
  File "/home/jeff/.virtualenvs/comfyui/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jeff/.virtualenvs/comfyui/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/net/dj/code/clones/github.com/comfyanonymous/ComfyUI/custom_nodes/ComfyUI_VLM_nodes/nodes/moondream/phi/modeling_phi.py", line 688, in forward
    attn_output = self._forward_cross_attn(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/net/dj/code/clones/github.com/comfyanonymous/ComfyUI/custom_nodes/ComfyUI_VLM_nodes/nodes/moondream/phi/modeling_phi.py", line 664, in _forward_cross_attn
    return self.inner_cross_attn(
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jeff/.virtualenvs/comfyui/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jeff/.virtualenvs/comfyui/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jeff/.virtualenvs/comfyui/lib/python3.11/site-packages/torch/amp/autocast_mode.py", line 16, in decorate_autocast
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/jeff/.virtualenvs/comfyui/lib/python3.11/site-packages/torch/amp/autocast_mode.py", line 16, in decorate_autocast
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/net/dj/code/clones/github.com/comfyanonymous/ComfyUI/custom_nodes/ComfyUI_VLM_nodes/nodes/moondream/phi/modeling_phi.py", line 462, in forward
    padding_mask.masked_fill_(key_padding_mask, 0.0)
RuntimeError: The expanded size of the tensor (749) must match the existing size (750) at non-singleton dimension 1.  Target sizes: [1, 749].  Tensor sizes: [1, 750]

gokayfem commented 7 months ago

Yes, I'm aware of this Moondream problem on other platforms (it works on Linux and Windows), but I couldn't solve it for Mac users. I will look into this in the near future.

chaincrafter commented 7 months ago

Same here, Ubuntu 22.04 comfyui up to date...

gokayfem commented 7 months ago

Same here, Ubuntu 22.04 comfyui up to date...

I wish I could reproduce the same error in Colab (Linux) or on my Windows PC so I could iterate on it, but it works in both cases.

I just tried it on Colab.

chaincrafter commented 7 months ago

I created a completely new conda env with python=3.11, cloned the official ComfyUI repo, then cloned your repo and added the same nodes as you have.

  File "/home/dev/ComfyUI/custom_nodes/ComfyUI_VLM_nodes/nodes/moondream/phi/modeling_phi.py", line 462, in forward
    padding_mask.masked_fill_(key_padding_mask, 0.0)
RuntimeError: The expanded size of the tensor (748) must match the existing size (749) at non-singleton dimension 1.  Target sizes: [1, 748].  Tensor sizes: [1, 749]

Same problem. It's a standard Ubuntu 22.04 install with an RTX 4090.

julien-blanchon commented 7 months ago

That's very strange. This also happens to me on Ubuntu 22.04 with an RTX 3090.

A quick fix is to replace:

        if key_padding_mask is not None:
            padding_mask = torch.full(
                (batch_size, seqlen), -10000.0, dtype=scores.dtype, device=scores.device
            )
            padding_mask.masked_fill_(key_padding_mask, 0.0)

            scores = scores + rearrange(padding_mask, "b s -> b 1 1 s")

With

        if key_padding_mask is not None:
            padding_mask = torch.full(
                (batch_size, seqlen_k),
                -10000.0,
                dtype=scores.dtype,
                device=scores.device,
            )
            # Trim the incoming mask to seqlen_k so its shape matches padding_mask.
            key_padding_mask = key_padding_mask[:, :seqlen_k]
            padding_mask.masked_fill_(key_padding_mask, 0.0)

            scores = scores + rearrange(padding_mask, "b s -> b 1 1 s")

At: https://github.com/gokayfem/ComfyUI_VLM_nodes/blob/a363153eb58d2599150db6310fe21615e7f01aea/nodes/moondream/phi/modeling_phi.py#L386-L392
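For anyone who wants to see the failure in isolation, here is a minimal sketch outside ComfyUI. It reproduces the same shape mismatch with plain tensors and shows that slicing the mask avoids it. The helper name `additive_padding_mask` and the all-`True` mask are made up for illustration and are not code from the repo. The sketch also assumes the extra position sits at the end of the mask; the thread does not establish why the mask is one element longer, so treat this as a workaround check rather than a root-cause analysis.

```python
import torch


def additive_padding_mask(key_padding_mask, seqlen_k, dtype=torch.float32):
    """Hypothetical helper: 0.0 where a key is kept, -10000.0 where it is masked out."""
    batch_size = key_padding_mask.shape[0]
    padding_mask = torch.full((batch_size, seqlen_k), -10000.0, dtype=dtype)
    # Trim the boolean mask so its last dimension matches padding_mask.
    key_padding_mask = key_padding_mask[:, :seqlen_k]
    padding_mask.masked_fill_(key_padding_mask, 0.0)
    return padding_mask


# Same shapes as in the traceback: a [1, 749] target and a [1, 750] boolean mask.
target = torch.full((1, 749), -10000.0)
mask = torch.ones(1, 750, dtype=torch.bool)

try:
    target.masked_fill_(mask, 0.0)  # the mask cannot broadcast to [1, 749]
    raised = False
except RuntimeError:
    raised = True

# With the slice in place the call succeeds and every kept position becomes 0.0.
fixed = additive_padding_mask(mask, seqlen_k=749)
```

If `raised` is `True` and `fixed` has shape `[1, 749]`, the slice is doing what the quick fix relies on.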

gokayfem commented 7 months ago

Thanks for the help. I added this fix to the repo. I also checked that it didn't break the setups that were already working.

gokayfem / ComfyUI_VLM_nodes

Moondream RuntimeError: expanded size of tensor (749) must match the existing size (750) at non-singleton dimension 1. #28