ValkyriaLenneth / Longformer_ZH


Sequence length should be multiple of 512. It can't be directly used for encoding #1

Open SouthWindShiB opened 2 years ago

SouthWindShiB commented 2 years ago

File "D:\Anaconda\envs\torch_1.7\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl result = self.forward(*input, kwargs) File "D:\Anaconda\envs\torch_1.7\lib\site-packages\transformers\models\bert\modeling_bert.py", line 1068, in forward return_dict=return_dict, File "D:\Anaconda\envs\torch_1.7\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl result = self.forward(*input, *kwargs) File "D:\Anaconda\envs\torch_1.7\lib\site-packages\transformers\models\bert\modeling_bert.py", line 591, in forward output_attentions, File "D:\Anaconda\envs\torch_1.7\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl result = self.forward(input, kwargs) File "D:\Anaconda\envs\torch_1.7\lib\site-packages\transformers\models\bert\modeling_bert.py", line 476, in forward past_key_value=self_attn_past_key_value, File "D:\Anaconda\envs\torch_1.7\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl result = self.forward(*input, *kwargs) File "D:\Anaconda\envs\torch_1.7\lib\site-packages\transformers\models\bert\modeling_bert.py", line 408, in forward output_attentions, File "D:\Anaconda\envs\torch_1.7\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl result = self.forward(input, *kwargs) File "I:\PycharmProject\zh_efficient-autogressive-EL\model\Longformer_zh.py", line 21, in forward output_attentions=output_attentions) File "D:\Anaconda\envs\torch_1.7\lib\site-packages\transformers\models\longformer\modeling_longformer.py", line 591, in forward query_vectors, key_vectors, self.one_sided_attn_window_size File "D:\Anaconda\envs\torch_1.7\lib\site-packages\transformers\models\longformer\modeling_longformer.py", line 803, in _sliding_chunks_query_key_matmul ), f"Sequence length should be multiple of {window_overlap 2}. Given {seq_len}" AssertionError: Sequence length should be multiple of 512. Given 158

Did you miss something that pads the sequence to a suitable length?

ValkyriaLenneth commented 2 years ago

The structure of the Longformer attention windows requires the input sequence length to be a multiple of the attention window size. To use the model, pad your input sequence to 512 or 1024 tokens and pass the model an attention mask that marks the padded positions, as in the sketch below.
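
A minimal sketch of that padding step, assuming already tokenized, right-padded inputs; `pad_to_window_multiple` is a hypothetical helper (not part of this repo), and `pad_token_id` should match your tokenizer:

```python
import torch
import torch.nn.functional as F

def pad_to_window_multiple(input_ids, attention_mask, window=512, pad_token_id=0):
    """Right-pad input_ids / attention_mask so the sequence length is a multiple of `window`."""
    seq_len = input_ids.size(1)
    pad_len = (window - seq_len % window) % window
    if pad_len > 0:
        # Pad the last (sequence) dimension on the right with the pad token.
        input_ids = F.pad(input_ids, (0, pad_len), value=pad_token_id)
        # Padded positions get attention_mask = 0 so the model ignores them.
        attention_mask = F.pad(attention_mask, (0, pad_len), value=0)
    return input_ids, attention_mask

# Example: a sequence of length 158 (as in the traceback) is padded up to 512.
input_ids = torch.randint(1, 1000, (1, 158))
attention_mask = torch.ones_like(input_ids)
input_ids, attention_mask = pad_to_window_multiple(input_ids, attention_mask, window=512)
print(input_ids.shape)  # torch.Size([1, 512])
```

If you tokenize with a HuggingFace tokenizer, calling it with `padding="max_length"` and `max_length` set to a multiple of the attention window achieves the same result and returns the matching attention mask directly.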