allenai / scifact

Data and models for the SciFact verification task.

Prediction error on the test set #13

Closed: MHDBST closed this issue 3 years ago

MHDBST commented 3 years ago

Hi all, I have installed the package and downloaded the data files. I can run prediction on the dev set successfully with this command:

./script/label-prediction.sh scibert scifact dev

I get the following results:

Accuracy           0.6916 
Macro F1:          0.6634 
Macro F1 w/o NEI:  0.5921

                   [C      N      S     ]
F1:                [0.4672 0.806  0.7171]
Precision:         [0.4848 0.9101 0.6566]
Recall:            [0.4507 0.7232 0.7899]

Confusion Matrix:
[[ 32   5  34]
 [  8  81  23]
 [ 26   3 109]] 
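
(As a sanity check, the macro scores above are consistent with the per-class F1 row; the "w/o NEI" number just drops the N column:)

f1 = {"C": 0.4672, "N": 0.806, "S": 0.7171}
print(sum(f1.values()) / 3)     # ≈ 0.6634 (Macro F1)
print((f1["C"] + f1["S"]) / 2)  # ≈ 0.5921 (Macro F1 w/o NEI)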

But when I run it on the test set, I get this error:

./script/label-prediction.sh scibert scifact test

Error:

Retrieving oracle abstracts.
Traceback (most recent call last):
  File "verisci/inference/abstract_retrieval/oracle.py", line 14, in <module>
    doc_ids = list(map(int, data['evidence'].keys()))
KeyError: 'evidence'
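
For reference, here's a minimal reproduction of that failure with made-up claim records: dev-set claims carry a gold 'evidence' map keyed by abstract doc ID, while test-set claims omit that field entirely.

# Illustrative records, not real data.
dev_claim = {"id": 1, "claim": "...", "evidence": {"4983": []}}
test_claim = {"id": 1, "claim": "..."}

# Mirrors the failing line in verisci/inference/abstract_retrieval/oracle.py:
print(list(map(int, dev_claim["evidence"].keys())))   # [4983]
print(list(map(int, test_claim["evidence"].keys())))  # raises KeyError: 'evidence'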

Selecting oracle rationales.

Predicting labels.
claim_and_rationale
Using device "cpu"
0it [00:00, ?it/s]

Evaluating.
Traceback (most recent call last):
  File "verisci/evaluate/label_prediction.py", line 40, in <module>
    print(f'Accuracy           {round(sum([pred_labels[i] == true_labels[i] for i in range(len(pred_labels))]) / len(pred_labels), 4)}')
ZeroDivisionError: division by zero
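
Looking at label_prediction.py, this second traceback is just downstream of the first: since abstract retrieval produced nothing, pred_labels is empty (hence the 0it progress bar) and the accuracy line divides by its length. A guarded version of that computation could look like this (a sketch, not the repo's actual code):

def accuracy(pred_labels, true_labels):
    """Fraction of exact matches; None when there are no predictions."""
    if not pred_labels:
        return None
    hits = sum(p == t for p, t in zip(pred_labels, true_labels))
    return round(hits / len(pred_labels), 4)

print(accuracy([], []))                         # None, not ZeroDivisionError
print(accuracy(["SUPPORT", "NOT_ENOUGH_INFO"],
               ["SUPPORT", "CONTRADICT"]))      # 0.5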

When I skip the part that writes the evaluation results, the prediction array is empty. How can I generate predictions without having the gold labels?

dwadden commented 3 years ago

Good point. I'll update the README to indicate that the label-prediction script only works on the dev set. For full-pipeline prediction and evaluation, use pipeline.sh; I just updated it so that it won't attempt evaluation when you're making test-set predictions.

Closing now, let me know if this doesn't work for you.
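
The gist of the change, rendered as a Python sketch (the actual edit is in pipeline.sh, and the argument wiring here is hypothetical):

import sys

# Hypothetical wiring: the dataset split is the script's last argument.
dataset = sys.argv[-1] if len(sys.argv) > 1 else "dev"

# Gold labels exist only for train/dev, so skip evaluation on test.
if dataset == "test":
    print("Test split: writing predictions only, skipping evaluation.")
else:
    print(f"Evaluating {dataset} predictions against gold labels.")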

MHDBST commented 3 years ago

I'm still facing the same error. It's looking for the 'evidence' key:

Retrieving abstracts.
Traceback (most recent call last):
  File "verisci/inference/abstract_retrieval/oracle.py", line 14, in <module>
    doc_ids = list(map(int, data['evidence'].keys()))
KeyError: 'evidence'
dwadden commented 3 years ago

This is actually expected: you can't run oracle abstract retrieval on the test set, since the oracle setting requires gold documents, which you don't have for test. You should run in the open setting instead. I'll update the README to clarify.
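
For anyone who lands here later: the oracle setting just reads gold doc IDs out of each claim's 'evidence' field, while the open setting retrieves abstracts from the full corpus per claim. A rough TF-IDF sketch of the open setting (the field names match the SciFact claim/corpus JSONL format, but the file paths and top-k value are illustrative, and this isn't the repo's exact retrieval code):

import json
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Corpus entries carry "doc_id", "title", and "abstract" (a list of sentences).
corpus = [json.loads(line) for line in open("data/corpus.jsonl")]
claims = [json.loads(line) for line in open("data/claims_test.jsonl")]

docs = [entry["title"] + " " + " ".join(entry["abstract"]) for entry in corpus]
vectorizer = TfidfVectorizer(stop_words="english")
doc_matrix = vectorizer.fit_transform(docs)

k = 3  # illustrative top-k
for claim in claims:
    scores = cosine_similarity(vectorizer.transform([claim["claim"]]), doc_matrix)[0]
    top_docs = [corpus[i]["doc_id"] for i in scores.argsort()[::-1][:k]]
    print(claim["id"], top_docs)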