microsoft / SwinBERT

Research code for CVPR 2022 paper "SwinBERT: End-to-End Transformers with Sparse Attention for Video Captioning"
https://arxiv.org/abs/2111.13196
MIT License
237 stars 35 forks

attention_scores= attention_scores + attention_mask #52

Open Xiyu-AI opened 11 months ago

Xiyu-AI commented 11 months ago

Thanks for your contributions. When I train the model with the settings --max_seq_length 30 --max_seq_a_length 30 --max_img_seq_length 18, I get the following error at the line attention_scores = attention_scores + attention_mask: RuntimeError: The size of tensor a (471) must match the size of tensor b (48) at non-singleton dimension 3

I'm confused and don't know what the problem is.

Could you please help me solve this problem? Thank you.
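For readers who hit the same message, the sketch below reproduces the shape check behind it in plain Python, without PyTorch. It is not code from the SwinBERT repository. The helper `broadcast_shape`, the batch size, and the head count are hypothetical, and it is an assumption that the mask length is built as max_seq_length + max_img_seq_length. Only the numbers 30, 18, 471 and 48 come from the report above.

```python
def broadcast_shape(a, b):
    """Hypothetical helper: mimic the broadcasting rule for two tensor shapes.

    Dimensions are compared from the last one backwards; two sizes are
    compatible when they are equal or when one of them is 1.
    """
    ndim = max(len(a), len(b))
    # Left-pad the shorter shape with 1s so both have the same rank.
    a = (1,) * (ndim - len(a)) + tuple(a)
    b = (1,) * (ndim - len(b)) + tuple(b)
    out = []
    for dim in range(ndim - 1, -1, -1):
        x, y = a[dim], b[dim]
        if x == y or y == 1:
            out.append(x)
        elif x == 1:
            out.append(y)
        else:
            raise ValueError(
                f"The size of tensor a ({x}) must match the size of "
                f"tensor b ({y}) at non-singleton dimension {dim}"
            )
    return tuple(reversed(out))


# Values taken from the command line in the report.
MAX_SEQ_LENGTH = 30
MAX_IMG_SEQ_LENGTH = 18

# Hypothetical values, chosen only so the shapes have four dimensions.
BATCH, HEADS = 2, 12

# Assumption: the mask covers text tokens plus the declared visual tokens.
mask_len = MAX_SEQ_LENGTH + MAX_IMG_SEQ_LENGTH    # 48, tensor b in the error
scores_len = 471                                  # tensor a in the error
visual_tokens = scores_len - MAX_SEQ_LENGTH       # 441 if 30 tokens are text

scores_shape = (BATCH, HEADS, scores_len, scores_len)
mask_shape = (BATCH, 1, mask_len, mask_len)

try:
    broadcast_shape(scores_shape, mask_shape)
    message = None
except ValueError as err:
    message = str(err)

print(message)
```

Under these assumptions, the arithmetic suggests that the sequence reaching the attention layer holds 441 visual tokens while the mask was sized for 18, so one thing to check is whether --max_img_seq_length matches the number of visual tokens the video backbone actually produces. The thread does not say how the problem was resolved, so treat this as a diagnostic hint rather than the fix.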

Xiyu-AI commented 11 months ago

I solved the problem.