allenai / longformer

Longformer: The Long-Document Transformer
https://arxiv.org/abs/2004.05150
Apache License 2.0

When fine-tuning on the QuAC QA dataset, I get RuntimeError: index out of range: Tried to access index 1 out of table with 0 rows. at /pytorch/aten/src/TH/generic/THTensorEvenMoreMath.cpp:418 #162

Open Antlerkeke opened 3 years ago

Antlerkeke commented 3 years ago

01/19/2021 10:26:56 - INFO - __main__ - device: cpu n_gpu: 0, distributed training: False, 16-bits training: False
01/19/2021 10:26:56 - INFO - transformers.tokenization_utils - Model name 'longformer_finetuned-squadv1' not found in model shortcut name list (allenai/longformer-base-4096, allenai/longformer-large-4096, allenai/longformer-large-4096-finetuned-triviaqa, allenai/longformer-base-4096-extra.pos.embd.only, allenai/longformer-large-4096-extra.pos.embd.only). Assuming 'longformer_finetuned-squadv1' is a path, a model identifier, or url to a directory containing tokenizer files.
01/19/2021 10:26:56 - INFO - transformers.tokenization_utils - Didn't find file longformer_finetuned-squadv1/added_tokens.json. We won't load it.
01/19/2021 10:26:56 - INFO - transformers.tokenization_utils - Didn't find file longformer_finetuned-squadv1/special_tokens_map.json. We won't load it.
01/19/2021 10:26:56 - INFO - transformers.tokenization_utils - loading file longformer_finetuned-squadv1/vocab.json
01/19/2021 10:26:56 - INFO - transformers.tokenization_utils - loading file longformer_finetuned-squadv1/merges.txt
01/19/2021 10:26:56 - INFO - transformers.tokenization_utils - loading file None
01/19/2021 10:26:56 - INFO - transformers.tokenization_utils - loading file None
01/19/2021 10:26:56 - INFO - transformers.tokenization_utils - loading file longformer_finetuned-squadv1/tokenizer_config.json
01/19/2021 10:28:13 - INFO - transformers.configuration_utils - loading configuration file longformer_finetuned-squadv1/config.json
01/19/2021 10:28:13 - INFO - transformers.configuration_utils - Model config LongformerConfig {
  "attention_mode": "longformer",
  "attention_probs_dropout_prob": 0.1,
  "attention_window": [512, 512, 512, 512, 512, 512, 512, 512, 512, 512, 512, 512],
  "bos_token_id": 0,
  "cls_token_id": 0,
  "eos_token_id": 2,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "ignore_attention_mask": false,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-05,
  "mask_token_id": 50264,
  "max_position_embeddings": 4098,
  "model_type": "longformer",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 1,
  "sep_token_id": 2,
  "type_vocab_size": 1,
  "unk_token_id": 3,
  "vocab_size": 50265
}

01/19/2021 10:28:13 - INFO - transformers.modeling_utils - loading weights file longformer_finetuned-squadv1/pytorch_model.bin
01/19/2021 10:28:19 - INFO - __main__ - Running training
01/19/2021 10:28:19 - INFO - __main__ - Batch size = 32
01/19/2021 10:28:19 - INFO - __main__ - Num steps = 18686
Iteration: 0%| | 0/14949 [00:00<?, ?it/s]
01/19/2021 10:28:19 - INFO - transformers.modeling_longformer - Initializing global attention on question tokens...
01/19/2021 10:28:19 - INFO - transformers.modeling_longformer - Input ids are automatically padded from 384 to 512 to be a multiple of config.attention_window: 512
Iteration: 0%| | 0/14949 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/suke/Store/nlp/LongFormer/run_quacqa.py", line 1378, in <module>
    main()
  File "/home/suke/Store/nlp/LongFormer/run_quacqa.py", line 1281, in main
    loss = model(input_ids, input_mask, None, segment_ids, None, None, start_positions, end_positions)
  File "/home/suke/Store/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/suke/Store/anaconda3/lib/python3.6/site-packages/transformers/modeling_longformer.py", line 983, in forward
    inputs_embeds=inputs_embeds,
  File "/home/suke/Store/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/suke/Store/anaconda3/lib/python3.6/site-packages/transformers/modeling_longformer.py", line 673, in forward
    encoder_attention_mask=None,
  File "/home/suke/Store/anaconda3/lib/python3.6/site-packages/transformers/modeling_bert.py", line 727, in forward
    input_ids=input_ids, position_ids=position_ids, token_type_ids=token_type_ids, inputs_embeds=inputs_embeds
  File "/home/suke/Store/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/suke/Store/anaconda3/lib/python3.6/site-packages/transformers/modeling_roberta.py", line 66, in forward
    input_ids, token_type_ids=token_type_ids, position_ids=position_ids, inputs_embeds=inputs_embeds
  File "/home/suke/Store/anaconda3/lib/python3.6/site-packages/transformers/modeling_bert.py", line 176, in forward
    token_type_embeddings = self.token_type_embeddings(token_type_ids)
  File "/home/suke/Store/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/suke/Store/anaconda3/lib/python3.6/site-packages/torch/nn/modules/sparse.py", line 114, in forward
    self.norm_type, self.scale_grad_by_freq, self.sparse)
  File "/home/suke/Store/anaconda3/lib/python3.6/site-packages/torch/nn/functional.py", line 1484, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: index out of range: Tried to access index 1 out of table with 0 rows. at /pytorch/aten/src/TH/generic/THTensorEvenMoreMath.cpp:418
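One plausible reading of this traceback, not a confirmed diagnosis: the last frame fails inside `token_type_embeddings`, and the config above reports `"type_vocab_size": 1`, so only token type id 0 exists (the old TH error message reports the highest valid index, which is why a one-row table shows up as "0 rows"). If `run_quacqa.py` builds BERT-style `segment_ids` containing 1s for the context tokens, that alone would reproduce the crash. A minimal sketch of this failure mode, with a hypothetical workaround:

```python
import torch
import torch.nn as nn

# The config above has "type_vocab_size": 1, so the token-type embedding
# table has a single row and only index 0 is a valid token type id.
token_type_embeddings = nn.Embedding(num_embeddings=1, embedding_dim=768)

# BERT-style QA preprocessing marks context tokens with segment id 1.
segment_ids = torch.tensor([[0, 0, 0, 1, 1, 1]])

try:
    token_type_embeddings(segment_ids)  # looks up index 1 in a 1-row table
except (RuntimeError, IndexError) as e:
    print(e)  # "index out of range", as in the traceback above

# Hypothetical workaround for the script: pass all-zero segment ids
# (or None, so the model creates zeros itself).
out = token_type_embeddings(torch.zeros_like(segment_ids))
print(out.shape)  # torch.Size([1, 6, 768])
```

If this is the cause, replacing `segment_ids` with `None` (or zeros) in the `model(...)` call above should get past this error.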

Antlerkeke commented 3 years ago

Why is max_position_embeddings set to 4098?
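Not an authoritative answer, but the value is consistent with RoBERTa's position-id convention, which Longformer inherits: real tokens are numbered starting from `padding_idx + 1 = 2` (the config has `"pad_token_id": 1`), so supporting 4096 tokens requires 4096 + 2 = 4098 rows in the position-embedding table. A small sketch mirroring the helper `transformers` uses internally (`create_position_ids_from_input_ids`); the example tensor is made up for illustration:

```python
import torch

def create_position_ids_from_input_ids(input_ids: torch.Tensor, padding_idx: int) -> torch.Tensor:
    """RoBERTa-style position ids: padding tokens keep padding_idx,
    real tokens count up starting from padding_idx + 1 (i.e. from 2)."""
    mask = input_ids.ne(padding_idx).long()
    return torch.cumsum(mask, dim=1) * mask + padding_idx

pad = 1  # matches "pad_token_id": 1 in the config above
ids = torch.tensor([[0, 100, 200, 2, pad, pad]])  # <s> tok tok </s> <pad> <pad>
print(create_position_ids_from_input_ids(ids, pad))
# tensor([[2, 3, 4, 5, 1, 1]]) -> real positions occupy ids 2..4097,
# so the table needs max_position_embeddings = 4096 + 2 = 4098 rows.
```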