vikhyat / moondream

tiny vision language model
https://moondream.ai
Apache License 2.0

Index out of bounds when used in Open Interpreter #123

Open · MikeBirdTech opened 1 month ago

MikeBirdTech commented 1 month ago

I'm trying to have Open Interpreter describe images locally with moondream, and it errors every time. I'm on a Mac with Apple silicon.

I'm not sure whether the issue is in how Open Interpreter passes images to moondream or something in my setup.

Happy to help resolve this if it's something on our end!

  File ~/Library/Python/3.11/lib/python/site-packages/torch/nn/modules/module.py:1532, in Module._wrapped_call_impl(self, *args, **kwargs)
     1530     return self._compiled_call_impl(*args, **kwargs)  # type: ignore
     1531 else:
  -> 1532     return self._call_impl(*args, **kwargs)

  File ~/Library/Python/3.11/lib/python/site-packages/torch/nn/modules/module.py:1541, in Module._call_impl(self, *args, **kwargs)
     1536 # If we don't have any hooks, we want to skip the rest of the logic in
     1537 # this function, and just call forward.
     1538 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
     1539         or _global_backward_pre_hooks or _global_backward_hooks
     1540         or _global_forward_hooks or _global_forward_pre_hooks):
  -> 1541     return forward_call(*args, **kwargs)
     1543 try:
     1544     result = None

  File ~/.cache/huggingface/modules/transformers_modules/vikhyatk/moondream2/9ba2958f5a886de83fa18a235d651295a05b4d13/modeling_phi.py:382, in PhiAttention.forward(self,
  hidden_states, attention_mask, position_ids, past_key_value, output_attentions, use_cache)
      377 key_rot, key_pass = (
      378     key_states[..., : self.rotary_emb.dim],
      379     key_states[..., self.rotary_emb.dim :],
      380 )
      381 #
  --> 382 query_rot, key_rot = apply_rotary_pos_emb(
      383     query_rot, key_rot, cos, sin, position_ids
      384 )
      386 #
      387 query_states = torch.cat((query_rot, query_pass), dim=-1)

  File ~/.cache/huggingface/modules/transformers_modules/vikhyatk/moondream2/9ba2958f5a886de83fa18a235d651295a05b4d13/modeling_phi.py:214, in apply_rotary_pos_emb(q, k,
  cos, sin, position_ids, unsqueeze_dim)
      193 def apply_rotary_pos_emb(q, k, cos, sin, position_ids, unsqueeze_dim=1):
      194     """Applies Rotary Position Embedding to the query and key tensors.
      195
      196     Args:
     (...)
      212         `tuple(torch.Tensor)` comprising of the query and key tensors rotated using the Rotary Position Embedding.
      213     """
  --> 214     cos = cos.unsqueeze(unsqueeze_dim)
      215     sin = sin.unsqueeze(unsqueeze_dim)
      216     q_embed = (q * cos) + (rotate_half(q) * sin)

  IndexError: index is out of bounds for dimension with size 0
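
For context, a minimal standalone sketch of the kind of call that produces the traceback above, based on the documented moondream2 usage on Hugging Face (the image path and prompt are placeholders):

  from transformers import AutoModelForCausalLM, AutoTokenizer
  from PIL import Image

  model_id = "vikhyatk/moondream2"
  model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
  tokenizer = AutoTokenizer.from_pretrained(model_id)

  image = Image.open("example.jpg")      # placeholder path to any local image
  enc_image = model.encode_image(image)  # run the vision encoder
  print(model.answer_question(enc_image, "Describe this image.", tokenizer))
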
tomprimozic commented 1 month ago

Downgrading to transformers==4.41.2 works for me.

Related: https://github.com/huggingface/transformers/issues/32321

Quoting the reply on that transformers issue: "From what I can see, the Phi model code is defined entirely within moondream, while it calls into the transformers implementation of generate(). I recommend opening an issue on moondream, since they would have to update the Phi modeling code; we have been actively modifying it lately."
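
For anyone else hitting this before the updated model code ships, a sketch of the workaround above: pin transformers to 4.41.2. The runtime guard below is illustrative only, not part of moondream or Open Interpreter:

  # Workaround from this thread:
  #   pip install "transformers==4.41.2"
  # Optional guard before loading moondream2 with trust_remote_code:
  import transformers
  from packaging import version

  if version.parse(transformers.__version__) > version.parse("4.41.2"):
      raise RuntimeError(
          "moondream2's bundled modeling_phi.py is known to break on newer "
          "transformers; downgrade with: pip install 'transformers==4.41.2'"
      )
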

vikhyat commented 1 month ago

Updated here for compatibility with the latest version of transformers: https://github.com/vikhyat/moondream/commit/22565c070cc1bcbfca5a2f758d3e120b882a6e4b

Haven't pushed to HF yet - will do next week.
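
In the meantime (and after the push), pinning an explicit Hub revision avoids silently picking up remote-code changes. A sketch, where "COMMIT_OR_TAG" is a placeholder for whichever revision includes the fix:

  from transformers import AutoModelForCausalLM, AutoTokenizer

  MODEL_ID = "vikhyatk/moondream2"
  REVISION = "COMMIT_OR_TAG"  # placeholder: pin to the revision that includes the fix

  model = AutoModelForCausalLM.from_pretrained(
      MODEL_ID, trust_remote_code=True, revision=REVISION
  )
  tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, revision=REVISION)
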

dsigmabcn commented 3 weeks ago

Hi!

Has it been pushed yet? I'm still getting the out-of-bounds error.

Thanks!