huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0
132.29k stars 26.35k forks

why squad.py did not reproduce squad1.1 report result? #4301

Closed yyHaker closed 4 years ago

yyHaker commented 4 years ago

📚 Migration

Information

Model I am using (Bert, XLNet ...):

Language I am using the model on (English...):

The problem arises when using:

The tasks I am working on is:

Details

But I could not reproduce the reported result. The repository says you should get the result below:

python $SQUAD_DIR/evaluate-v1.1.py $SQUAD_DIR/dev-v1.1.json ../models/wwm_uncased_finetuned_squad/predictions.json
{"exact_match": 86.91579943235573, "f1": 93.1532499015869}

My result is below:

python $SQUAD_DIR/evaluate-v1.1.py $SQUAD_DIR/dev-v1.1.json ../models/wwm_uncased_finetuned_squad/predictions.json
{"exact_match": 81.03, "f1": 88.02}
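For readers comparing the two outputs, here is a rough sketch of what the `exact_match` and `f1` numbers measure for a single prediction. It is a simplified approximation of the scoring idea, assuming one reference answer per question (the official script compares against several references and averages over the whole dev set); the function names are illustrative, not taken from `evaluate-v1.1.py`.

```python
import re
import string
from collections import Counter


def normalize_answer(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, truth):
    """True when the normalized strings are identical."""
    return normalize_answer(prediction) == normalize_answer(truth)


def f1(prediction, truth):
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize_answer(prediction).split()
    truth_tokens = normalize_answer(truth).split()
    # Multiset intersection counts each shared token at most as often as it
    # appears in both answers.
    num_same = sum((Counter(pred_tokens) & Counter(truth_tokens)).values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)


print(exact_match("The Eiffel Tower.", "eiffel tower"))
print(f1("Eiffel Tower in Paris", "the Eiffel Tower"))
```

Because both metrics are computed on extracted answer spans, anything that degrades span prediction (for example, a tokenization mismatch) would be expected to lower both numbers together, as seen above.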

Environment info

Checklist

MagicFrogSJTU commented 4 years ago

I have just solved this problem. You have to set an additional flag: --do_lower_case. I wonder why run_squad.py behaves differently from run_glue.py, etc. Is a code fix on the way?
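A toy sketch of why a missing lowercasing step could hurt an uncased checkpoint: if the vocabulary only contains lowercase entries, any capitalized word falls through to the unknown token. This is not the transformers tokenizer; the vocabulary and the `toy_tokenize` helper are hypothetical and only illustrate the suspected mechanism.

```python
# Hypothetical "uncased" vocabulary: every entry is lowercase.
UNCASED_VOCAB = {"[UNK]": 0, "who": 1, "wrote": 2, "hamlet": 3, "shakespeare": 4}


def toy_tokenize(text, do_lower_case):
    """Map whitespace-separated words to vocab ids, optionally lowercasing first."""
    if do_lower_case:
        text = text.lower()
    unk_id = UNCASED_VOCAB["[UNK]"]
    return [UNCASED_VOCAB.get(word, unk_id) for word in text.split()]


question = "Who wrote Hamlet"
# Without lowercasing, "Who" and "Hamlet" are not found in the vocabulary.
print(toy_tokenize(question, do_lower_case=False))
# With lowercasing, every word maps to a real vocabulary entry.
print(toy_tokenize(question, do_lower_case=True))
```

If something similar happens in the real pipeline, the model would see many unknown tokens at both training and evaluation time, which would be consistent with the drop in scores reported above.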

LysandreJik commented 4 years ago

You shouldn't have to set --do_lower_case as it should be lowercased by default for that model.

MagicFrogSJTU commented 4 years ago

> You shouldn't have to set --do_lower_case as it should be lowercased by default for that model.

I thought it was, and it should be, but it isn't.

julien-c commented 4 years ago

Closing this b/c #4245 was merged

(we still need to investigate why the lowercasing is not properly populated by the model's config)