X-PLUG / mPLUG-Owl

mPLUG-Owl: The Powerful Multi-modal Large Language Model Family
https://www.modelscope.cn/studios/damo/mPLUG-Owl
MIT License
2.32k stars 176 forks

Train Error: Attention mask shape #249

Open MiHongze-tju opened 1 month ago

MiHongze-tju commented 1 month ago

Hi, when I train mPLUG-Owl, training fails with the following error:

  File "/home/pg/anaconda3/envs/owl2/lib/python3.10/site-packages/torch/autograd/function.py", line 506, in apply
    return forward_call(*args, kwargs)
  File "/home/pg/anaconda3/envs/owl2/lib/python3.10/site-packages/deepspeed/utils/nvtx.py", line 15, in wrapped_fn
    ret_val = func(*args, *kwargs)
  File "/home/pg/anaconda3/envs/owl2/lib/python3.10/site-packages/deepspeed/runtime/engine.py", line 1735, in forward
    return super().apply(args, kwargs) # type: ignore[misc]
  File "/home/pg/anaconda3/envs/owl2/lib/python3.10/site-packages/torch/utils/checkpoint.py", line 107, in forward
    outputs = run_function(args)
  File "/home/pg/mPLUG-Owl-main/mPLUG-Owl2/mplug_owl2/train/../model/modeling_llama2.py", line 323, in custom_forward
    return module(inputs, past_key_value, output_attentions)
  File "/home/pg/anaconda3/envs/owl2/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1538, in _call_impl
    loss = self.module(*inputs, kwargs)
  File "/home/pg/anaconda3/envs/owl2/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1538, in _call_impl
    loss = self.module(*inputs, *kwargs)
  File "/home/pg/anaconda3/envs/owl2/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1538, in _call_impl
    result = forward_call(args, kwargs)
  File "/home/pg/mPLUG-Owl-main/mPLUG-Owl2/mplug_owl2/train/../model/modeling_llama2.py", line 212, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
  File "/home/pg/anaconda3/envs/owl2/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1538, in _call_impl
    outputs = model(inputs)
  File "/home/pg/anaconda3/envs/owl2/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    result = forward_call(*args, *kwargs)
  File "/home/pg/mPLUG-Owl-main/mPLUG-Owl2/mplug_owl2/train/../model/modeling_mplug_owl2.py", line 257, in forward
    result = forward_call(args, kwargs)
  File "/home/pg/mPLUG-Owl-main/mPLUG-Owl2/mplug_owl2/train/../model/modeling_mplug_owl2.py", line 257, in forward
    outputs = self.model(
  File "/home/pg/anaconda3/envs/owl2/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1538, in _call_impl
    return forward_call(*args, kwargs)
  File "/home/pg/anaconda3/envs/owl2/lib/python3.10/site-packages/deepspeed/utils/nvtx.py", line 15, in wrapped_fn
    ret_val = func(*args, *kwargs)
    result = forward_call(args, kwargs)
  File "/home/pg/anaconda3/envs/owl2/lib/python3.10/site-packages/deepspeed/runtime/engine.py", line 1735, in forward
  File "/home/pg/mPLUG-Owl-main/mPLUG-Owl2/mplug_owl2/train/../model/modeling_llama2.py", line 139, in forward
    outputs = self.model(
  File "/home/pg/anaconda3/envs/owl2/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1538, in _call_impl
    result = forward_call(*args, **kwargs)
  File "/home/pg/mPLUG-Owl-main/mPLUG-Owl2/mplug_owl2/train/../model/modeling_llama2.py", line 327, in model_forward
    raise ValueError(
ValueError: Attention mask should be of size (2, 1, 148, 148), but is torch.Size([2, 148])

How can I fix this?
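
For reference, the two shapes in the error message relate as follows: (2, 148) is a per-token padding mask of shape (batch, seq_len), while the attention layer expects a mask of shape (batch, 1, query_len, key_len). The sketch below is only an illustration of that relationship, not the repository's code. The names `expand_padding_mask` and `shape` are hypothetical, and the additive 0 / -inf convention is an assumption based on how Llama-style decoders typically build their causal mask. It uses plain Python lists so it runs without dependencies; real training code would do this with tensor operations.

```python
import math


def expand_padding_mask(mask_2d):
    """Expand a [batch, seq_len] padding mask (1 = token, 0 = padding) into a
    [batch, 1, seq_len, seq_len] additive causal mask (0.0 = attend, -inf = blocked)."""
    expanded = []
    for row in mask_2d:
        seq_len = len(row)
        per_query = [
            # Query position q may attend to key position k only if k is not
            # in the future (k <= q) and k is a real token, not padding.
            [0.0 if (k <= q and row[k] == 1) else -math.inf for k in range(seq_len)]
            for q in range(seq_len)
        ]
        expanded.append([per_query])  # wrap once more for the singleton head dimension
    return expanded


def shape(nested):
    """Return the dimensions of a regular nested list as a tuple."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


# Same sizes as in the error: batch of 2, sequence length 148.
# The second sample is assumed to end with 8 padding tokens, for illustration.
mask_2d = [[1] * 148, [1] * 140 + [0] * 8]
mask_4d = expand_padding_mask(mask_2d)
print(shape(mask_2d), "->", shape(mask_4d))
```

The ValueError means this expansion step did not happen before the mask reached the attention layer, so the 2D mask was passed through unchanged.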