NSTiwari / Fine-tune-IDEFICS-Vision-Language-Model

This repository demonstrates data preparation and fine-tuning of the IDEFICS Vision Language Model.
MIT License
17 stars, 1 fork

Issue while inference #1

Open Kapil-Pathak opened 5 months ago

Kapil-Pathak commented 5 months ago

Hi, I am getting the following error during inference, after training has completed.

  File "v2_main.py", line 156, in <module>
    generated_ids = model.generate(**inputs, max_new_tokens=40)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/transformers/generation/utils.py", line 1896, in generate
    result = self._sample(
  File "/home/ubuntu/.local/lib/python3.8/site-packages/transformers/generation/utils.py", line 2633, in _sample
    outputs = self(
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/accelerate/hooks.py", line 166, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/transformers/models/idefics2/modeling_idefics2.py", line 1829, in forward
    outputs = self.model(
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/accelerate/hooks.py", line 166, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/transformers/models/idefics2/modeling_idefics2.py", line 1656, in forward
    inputs_embeds = self.inputs_merger(
  File "/home/ubuntu/.local/lib/python3.8/site-packages/transformers/models/idefics2/modeling_idefics2.py", line 1542, in inputs_merger
    new_inputs_embeds[special_image_token_mask] = reshaped_image_hidden_states
RuntimeError: shape mismatch: value tensor of shape [64, 4096] cannot be broadcast to indexing result of shape [0, 4096]

I am running the same code as in this repository, without any changes. Could you please also list the library versions you used? Thanks.

Tien5770 commented 1 month ago

Facing the same issue. Have you resolved it yet?
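
One possible reading of the traceback above, offered as a debugging aid rather than a confirmed fix: the failing line assigns the image hidden states into the positions of `input_ids` selected by `special_image_token_mask`. An indexing result of shape [0, 4096] suggests the mask selected no positions, i.e. the tokenized prompt contained no image placeholder tokens, while the value tensor of shape [64, 4096] suggests 64 image embedding rows were waiting to be written. The sketch below checks for that condition in plain Python. The function names and the `IMAGE_TOKEN_ID` value are hypothetical and chosen for illustration; they are not taken from the repository.

```python
from typing import Sequence

# Hypothetical placeholder id, for illustration only. In a real run, read the
# id from your own model/processor instead of hard-coding it.
IMAGE_TOKEN_ID = 32001


def count_image_slots(input_ids: Sequence[int], image_token_id: int) -> int:
    """Count the positions in a tokenized prompt that hold the image placeholder id."""
    return sum(1 for tok in input_ids if tok == image_token_id)


def check_image_slots(input_ids: Sequence[int], image_token_id: int,
                      num_image_rows: int) -> int:
    """Raise a readable error if placeholder count and image-row count differ.

    num_image_rows is the number of image embedding rows the model will try
    to write into the prompt (64 in the traceback above).
    """
    slots = count_image_slots(input_ids, image_token_id)
    if slots != num_image_rows:
        raise ValueError(
            f"prompt has {slots} image placeholder token(s) but "
            f"{num_image_rows} image embedding row(s) must be placed"
        )
    return slots


# A text-only prompt (made-up token ids): no placeholder, so the check fails.
text_only_prompt = [1, 1247, 28747, 2]

# A prompt with 64 placeholders (made-up layout): the check passes.
prompt_with_image = [1] + [IMAGE_TOKEN_ID] * 64 + [2]

try:
    check_image_slots(text_only_prompt, IMAGE_TOKEN_ID, num_image_rows=64)
except ValueError as err:
    print("mismatch:", err)

print("slots found:", check_image_slots(prompt_with_image, IMAGE_TOKEN_ID, 64))
```

If the count comes back as zero on real inputs, the prompt construction is worth inspecting first, for example whether the text passed to the processor actually includes an image entry. Differences between installed library versions could also play a part, which is why the version question above matters.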