huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Moondream breaks on transformers 4.42+ #31782

Closed pranay-ar closed 3 weeks ago

pranay-ar commented 1 month ago

System Info

Who can help?

@amyeroberts

Information

Tasks

Reproduction

from transformers import AutoModelForCausalLM, AutoTokenizer
from PIL import Image

model_id = "vikhyatk/moondream2"
revision = "2024-05-20"

# load the Moondream model and tokenizer from the Hub (custom remote code)
model = AutoModelForCausalLM.from_pretrained(
    model_id, trust_remote_code=True, revision=revision
)
tokenizer = AutoTokenizer.from_pretrained(model_id, revision=revision)

# encode an image and ask a question about it
image = Image.open('./seattle.jpg')
enc_image = model.encode_image(image)
print(model.answer_question(enc_image, "Describe this image.", tokenizer))

Expected behavior

The script should print the model's answer to the question. Instead, on transformers 4.42+, it raises the following:
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/home/pranay/GRID-Core/grid/model/perception/vlm/moondream.py", line 93, in <module>
    sanity_check_moondream()
  File "/home/pranay/GRID-Core/grid/model/perception/vlm/moondream.py", line 21, in sanity_check_moondream
    outputs = moondream.run(img, "What objects do you see?")
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/pranay/GRID-Core/grid/model/model.py", line 14, in wrapped_method
    return method(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/pranay/miniconda3/envs/grid/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/pranay/GRID-Core/grid/model/perception/vlm/moondream.py", line 87, in run
    answer = MoonDream._static_model.answer_question(enc_image, question, MoonDream._static_tokenizer)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/pranay/.cache/huggingface/modules/transformers_modules/af5f98991c2661d0c4448ff6c2fb3c19e74f1c02/moondream.py", line 96, in answer_question
    answer = self.generate(
             ^^^^^^^^^^^^^^
  File "/home/pranay/.cache/huggingface/modules/transformers_modules/af5f98991c2661d0c4448ff6c2fb3c19e74f1c02/moondream.py", line 80, in generate
    output_ids = self.text_model.generate(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/pranay/miniconda3/envs/grid/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/pranay/miniconda3/envs/grid/lib/python3.11/site-packages/transformers/generation/utils.py", line 1914, in generate
    result = self._sample(
             ^^^^^^^^^^^^^
  File "/home/pranay/miniconda3/envs/grid/lib/python3.11/site-packages/transformers/generation/utils.py", line 2651, in _sample
    outputs = self(
              ^^^^^
  File "/home/pranay/miniconda3/envs/grid/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/pranay/miniconda3/envs/grid/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/pranay/.cache/huggingface/modules/transformers_modules/af5f98991c2661d0c4448ff6c2fb3c19e74f1c02/modeling_phi.py", line 1074, in forward
    outputs = self.transformer(
              ^^^^^^^^^^^^^^^^^
  File "/home/pranay/miniconda3/envs/grid/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/pranay/miniconda3/envs/grid/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/pranay/.cache/huggingface/modules/transformers_modules/af5f98991c2661d0c4448ff6c2fb3c19e74f1c02/modeling_phi.py", line 929, in forward
    layer_outputs = decoder_layer(
                    ^^^^^^^^^^^^^^
  File "/home/pranay/miniconda3/envs/grid/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/pranay/miniconda3/envs/grid/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/pranay/.cache/huggingface/modules/transformers_modules/af5f98991c2661d0c4448ff6c2fb3c19e74f1c02/modeling_phi.py", line 733, in forward
    attn_outputs, self_attn_weights, present_key_value = self.mixer(
                                                         ^^^^^^^^^^^
  File "/home/pranay/miniconda3/envs/grid/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/pranay/miniconda3/envs/grid/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/pranay/.cache/huggingface/modules/transformers_modules/af5f98991c2661d0c4448ff6c2fb3c19e74f1c02/modeling_phi.py", line 382, in forward
    query_rot, key_rot = apply_rotary_pos_emb(
                         ^^^^^^^^^^^^^^^^^^^^^
  File "/home/pranay/.cache/huggingface/modules/transformers_modules/af5f98991c2661d0c4448ff6c2fb3c19e74f1c02/modeling_phi.py", line 214, in apply_rotary_pos_emb
    cos = cos[position_ids].unsqueeze(unsqueeze_dim)
          ~~~^^^^^^^^^^^^^^
IndexError: index is out of bounds for dimension with size 0
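
The last frame can be reproduced in isolation. Below is a minimal sketch, assuming the rotary cos cache somehow ends up empty (size 0 along dim 0) on 4.42+, which is what the error message suggests rather than something I have confirmed in the Moondream code:

import torch

# stand-in for the rotary embedding cache; the error implies it has size 0 along dim 0
cos = torch.empty(0, 32)
position_ids = torch.tensor([[0, 1, 2]])

# same failure mode as apply_rotary_pos_emb in modeling_phi.py line 214:
# IndexError: index is out of bounds for dimension with size 0
cos[position_ids]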

LysandreJik commented 1 month ago

Hello! You don't get this error in v4.41.0?

Moondream is not maintained by us, unfortunately, but we'd be happy to look into which change in transformers caused the break and see how we could fix it.

pranay-ar commented 1 month ago

Yes, the code works perfectly fine up to and including 4.41.2. It only breaks on the 4.42.0+ releases!

hypernovas commented 1 month ago

> Yes, the code works perfectly fine up to and including 4.41.2. It only breaks on the 4.42.0+ releases!

Hi @pranay-ar, curious if you have found a fix for this?

pranay-ar commented 1 month ago

Hi @hypernovas, according to the author of Moondream there is a new version coming soon, so I decided to wait for that one and reverted to 4.41.2 for my use case.
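
In case it helps anyone else pinning in the meantime, here is a minimal guard sketch. The 4.42.0 cutoff is just an assumption based on this thread, not an official compatibility bound:

import transformers
from packaging import version

# assumed cutoff from this thread: moondream2 (revision 2024-05-20) works on 4.41.2
# and breaks on 4.42.0+; adjust once a fixed moondream2 revision is released
if version.parse(transformers.__version__) >= version.parse("4.42.0"):
    raise RuntimeError(
        f"transformers {transformers.__version__} detected; pin transformers==4.41.2 "
        "to avoid the apply_rotary_pos_emb IndexError with moondream2."
    )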

hypernovas commented 1 month ago

@pranay-ar Thanks! I am actually getting this problem with a lower transformers version:

accelerate==0.25.0 huggingface-hub==0.20.1 Pillow==10.1.0 torch==2.1.2 torchvision==0.16.2 transformers==4.36.2 einops==0.7.0 gradio==4.15.0 flash-attn==2.5.8

Do you mind sharing which library versions work for you? I'd really appreciate it. I suspect one of their dependencies or some other library got upgraded. And maybe you also don't want to upgrade any libs ;)

pranay-ar commented 1 month ago

Hey @hypernovas, sorry for the late reply. I found the fix and it was merged into the main branch today. Can you try again and let me know if you're still facing the issue?

These are my library versions for your reference:

accelerate==0.31.0 huggingface-hub==0.23.3 Pillow==10.3.0 torch==2.2.0 torchvision==0.17.0 transformers==4.42.4 einops==0.8.0

edmondja commented 1 month ago

Hello, FYI I have a similar problem with a similar VLM I made using Phi-3 mini. Using inputs_embeds with model.generate I get the following error (I don't get it with 4.41.1):

  File "/opt/conda/envs/ptca/lib/python3.10/site-packages/transformers/generation/utils.py", line 1914, in generate
    result = self._sample(
  File "/opt/conda/envs/ptca/lib/python3.10/site-packages/transformers/generation/utils.py", line 2651, in _sample
    outputs = self(
  File "/opt/conda/envs/ptca/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/envs/ptca/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/microsoft/Phi-3-mini-4k-instruct/c1358f8a35e6d2af81890deffbbfa575b978c62f/modeling_phi3.py", line 1243, in forward
    outputs = self.model(
  File "/opt/conda/envs/ptca/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/envs/ptca/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/microsoft/Phi-3-mini-4k-instruct/c1358f8a35e6d2af81890deffbbfa575b978c62f/modeling_phi3.py", line 1072, in forward
    position_ids = position_ids.view(-1, seq_length).long()
RuntimeError: shape '[-1, 0]' is invalid for input of size 50
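
The last frame there is the same kind of shape problem in isolation. A minimal sketch, assuming seq_length is derived as 0 when only inputs_embeds are passed (my guess at the cause, not something confirmed in modeling_phi3.py):

import torch

position_ids = torch.arange(50).unsqueeze(0)  # 50 positions, matching "input of size 50"
seq_length = 0                                # presumably what the model computes here

# RuntimeError: shape '[-1, 0]' is invalid for input of size 50
position_ids.view(-1, seq_length).long()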