InternLM / InternLM-XComposer

InternLM-XComposer2 is a groundbreaking vision-language large model (VLLM) excelling in free-form text-image composition and comprehension.

example_chat.py RuntimeError: The size of tensor a (1377) must match the size of tensor b (1376) at non-singleton dimension 3 #339

Closed lilichu closed 3 days ago

lilichu commented 3 days ago

Running example_chat.py fails with the following error:
  File "/usr/local/app/.cache/huggingface/modules/transformers_modules/internlm/internlm-xcomposer2-vl-7b/c67bd06390dbe068a582c6561570725b1289a7c5/modeling_internlm2.py", line 884, in forward
    attention_mask = self._prepare_decoder_attention_mask(
  File "/usr/local/app/.cache/huggingface/modules/transformers_modules/internlm/internlm-xcomposer2-vl-7b/c67bd06390dbe068a582c6561570725b1289a7c5/modeling_internlm2.py", line 820, in _prepare_decoder_attention_mask
    expanded_attn_mask + combined_attention_mask)
RuntimeError: The size of tensor a (1377) must match the size of tensor b (1376) at non-singleton dimension 3

I have not changed anything; running example_chat.py as-is triggers this error. If image is None (the input_ids path), it runs, but passing an image (the input_embeds path) fails.

lilichu commented 3 days ago

Solution: I was using transformers 4.30.2. Updating to transformers==4.33.2 fixed the error.
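
For anyone hitting the same error, here is a rough sketch of a version check to run before example_chat.py. It only compares the installed transformers version against the 4.33.2 reported above. The helper names (parse_version, is_older, check_transformers) are made up for illustration and are not part of transformers or this repo. I am also assuming that versions at or above 4.33.2 behave the same way, which this thread does not confirm.

```python
from importlib import metadata

# Version reported in this thread as fixing the error (4.30.2 failed).
KNOWN_GOOD = "4.33.2"


def parse_version(version: str) -> tuple:
    """Turn '4.33.2' into (4, 33, 2); stops at the first non-numeric piece such as 'dev0'."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_older(installed: str, required: str = KNOWN_GOOD) -> bool:
    """Return True if `installed` is a lower version than `required`."""
    return parse_version(installed) < parse_version(required)


def check_transformers() -> str:
    """Describe whether the installed transformers is older than KNOWN_GOOD."""
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return "transformers is not installed"
    if is_older(installed):
        return (
            f"transformers {installed} is older than {KNOWN_GOOD}; "
            f"consider: pip install transformers=={KNOWN_GOOD}"
        )
    return f"transformers {installed} is at or above {KNOWN_GOOD}"


if __name__ == "__main__":
    print(check_transformers())
```

The check uses only the standard library, so it runs even when transformers itself is missing or fails to import.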