plkmo / BERT-Relation-Extraction

PyTorch implementation for "Matching the Blanks: Distributional Similarity for Relation Learning" paper
Apache License 2.0

evaluation error #18

Open GhadaAlfattni opened 4 years ago

GhadaAlfattni commented 4 years ago

Hi

I am trying to run your code on another task. I have processed my data to match the SemEval data format, and I have updated the number of relations and the two evaluation files with my relation types. However, I keep getting the error below. Any idea why?

[Epoch: 1, 3216/ 32185 points] total loss, accuracy per batch: 0.703, 0.733
[Epoch: 1, 6432/ 32185 points] total loss, accuracy per batch: 0.610, 0.774
[Epoch: 1, 9648/ 32185 points] total loss, accuracy per batch: 0.567, 0.794
[Epoch: 1, 12864/ 32185 points] total loss, accuracy per batch: 0.577, 0.780
[Epoch: 1, 16080/ 32185 points] total loss, accuracy per batch: 0.561, 0.785
[Epoch: 1, 19296/ 32185 points] total loss, accuracy per batch: 0.536, 0.793
[Epoch: 1, 22512/ 32185 points] total loss, accuracy per batch: 0.545, 0.799
[Epoch: 1, 25728/ 32185 points] total loss, accuracy per batch: 0.540, 0.792
[Epoch: 1, 28944/ 32185 points] total loss, accuracy per batch: 0.522, 0.799
[Epoch: 1, 32160/ 32185 points] total loss, accuracy per batch: 0.489, 0.810
06/21/2020 12:24:46 AM [INFO]: Evaluating test samples...
0%| | 1/1644 [00:15<7:00:27, 15.35s/it]
Traceback (most recent call last):
  File "main_task.py", line 48, in <module>
    net = train_and_fit(args)
  File "/Volumes/ghada/nlp/BERT-Relation-Extraction-master/src/tasks/trainer.py", line 159, in train_and_fit
    results = evaluate_results(net, test_loader, pad_id, cuda)
  File "/Volumes/ghada/nlp/BERT-Relation-Extraction-master/src/tasks/train_funcs.py", line 93, in evaluate_results
    e1_e2_start=e1_e2_start)
  File "/Users/ghada/opt/anaconda3/envs/relationBert/lib/python3.6/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/Volumes/ghada/nlp/BERT-Relation-Extraction-master/src/model/BERT/modeling_bert.py", line 734, in forward
    embedding_output = self.embeddings(input_ids=input_ids, position_ids=position_ids, token_type_ids=token_type_ids, inputs_embeds=inputs_embeds)
  File "/Users/ghada/opt/anaconda3/envs/relationBert/lib/python3.6/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/Volumes/ghada/nlp/BERT-Relation-Extraction-master/src/model/BERT/modeling_bert.py", line 177, in forward
    position_embeddings = self.position_embeddings(position_ids)
  File "/Users/ghada/opt/anaconda3/envs/relationBert/lib/python3.6/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/Users/ghada/opt/anaconda3/envs/relationBert/lib/python3.6/site-packages/torch/nn/modules/sparse.py", line 114, in forward
    self.norm_type, self.scale_grad_by_freq, self.sparse)
  File "/Users/ghada/opt/anaconda3/envs/relationBert/lib/python3.6/site-packages/torch/nn/functional.py", line 1724, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
IndexError: index out of range in self

plkmo commented 3 years ago

This probably has to do with your test_loader outputs.
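An embedding lookup raises "index out of range in self" when it receives an index that is not smaller than the number of rows in its table, and the traceback above ends in the position_embeddings lookup. One way to inspect the test_loader outputs is sketched below. It is only a sketch under assumptions: the function name, the vocab_size and max_positions values, and the layout of the batches (plain nested lists of token ids, e.g. obtained with .tolist() on a tensor) are hypothetical and not taken from the repository.

```python
def find_out_of_range(batches, vocab_size, max_positions):
    """Return (batch_idx, row_idx, reason) for every row that an
    embedding lookup with these table sizes would reject."""
    problems = []
    for b, batch in enumerate(batches):
        for r, row in enumerate(batch):
            if len(row) > max_positions:
                # Position ids run from 0 to len(row) - 1, so a row longer
                # than the position table overflows it.
                problems.append((b, r, "sequence longer than max_positions"))
            elif any(t < 0 or t >= vocab_size for t in row):
                problems.append((b, r, "token id outside vocabulary"))
    return problems


# Hypothetical batches of token ids standing in for test_loader outputs.
batches = [
    [[101, 2023, 102], [101, 7, 102]],
    [[101, 99999, 102], [1, 2, 3, 4, 5, 6]],
]

# vocab_size and max_positions are illustrative; use the values from
# your own model configuration.
problems = find_out_of_range(batches, vocab_size=30522, max_positions=5)
for batch_idx, row_idx, reason in problems:
    print(f"batch {batch_idx}, row {row_idx}: {reason}")
```

If a check like this flags rows in the test set but not in the train set, the test preprocessing is the place to look first.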

VahidehReshadat commented 3 years ago

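It seems the number of relation types in your data differs from the default. Note that the order of the named entities within a relation matters: the same relation with its entities in the opposite order counts as a separate relation type. In the SemEval data, for example, Cause-Effect(e1,e2) and Cause-Effect(e2,e1) are two different classes.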
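A quick way to see how many relation types a dataset really contains, with each direction counted separately, is sketched below. The labels and the configured_num_classes value are made up for illustration; substitute your own label column and the class count you pass to the model.

```python
from collections import Counter

# Hypothetical labels in SemEval style; the entity order is part of the label.
labels = [
    "Cause-Effect(e1,e2)",
    "Cause-Effect(e2,e1)",
    "Component-Whole(e1,e2)",
    "Cause-Effect(e1,e2)",
    "Other",
]

counts = Counter(labels)
num_relation_types = len(counts)

# Hypothetical value standing in for the number of classes given to the model.
configured_num_classes = 3

if num_relation_types != configured_num_classes:
    print(
        f"Data has {num_relation_types} relation types, "
        f"but {configured_num_classes} are configured"
    )
```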

plkmo commented 3 years ago

If you want to run the task on your own data, note that in preprocessing_funcs.py, line 70, rm = Relations_Mapper(df_train['relations']) maps the relation classes using the train set only. You therefore need to either ensure that every relation class appears in the train dataset, or modify the code so that the relation classes from both the train and test sets are captured. This may be what is causing your errors.
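The second option could look roughly like the sketch below, which builds the label mapping from the union of the train and test labels. It does not reproduce the repository's Relations_Mapper class; build_relation_maps and the example labels are hypothetical, and the label columns are assumed to be plain iterables of strings.

```python
def build_relation_maps(*label_columns):
    """Build label -> index and index -> label maps from every column given,
    assigning indices in order of first appearance."""
    rel2idx = {}
    for column in label_columns:
        for rel in column:
            if rel not in rel2idx:
                rel2idx[rel] = len(rel2idx)
    idx2rel = {idx: rel for rel, idx in rel2idx.items()}
    return rel2idx, idx2rel


# Hypothetical label columns standing in for the train and test relations.
train_relations = ["Cause-Effect(e1,e2)", "Other", "Cause-Effect(e1,e2)"]
test_relations = ["Other", "Cause-Effect(e2,e1)"]

rel2idx, idx2rel = build_relation_maps(train_relations, test_relations)

# Classes that a train-only mapping would have missed.
missing = sorted(set(test_relations) - set(train_relations))
print("classes only in test:", missing)
print("total classes:", len(rel2idx))
```

With a mapping built this way, the number of classes given to the model would need to equal len(rel2idx). A class that appears only in the test set still has no training examples, so the model cannot learn to predict it; the mapping only prevents the lookup from failing.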