yyHaker commented 4 years ago

📚 Migration

Information

Model I am using (Bert, XLNet ...):

Language I am using the model on (English...):

The problem arises when using:

[x] the official example scripts: (give details below) examples/question-answering/run_squad.py
[x] my own modified scripts: (give details below)
'''
CUDA_VISIBLE_DEVICES=5 python examples/question-answering//run_squad.py \
  --model_type bert \
  --model_name_or_path bert-large-uncased-whole-word-masking \
  --do_train \
  --do_eval \
  --data_dir EKMRC/data/squad1.1 \
  --train_file train-v1.1.json \
  --predict_file dev-v1.1.json \
  --per_gpu_eval_batch_size=4 \
  --per_gpu_train_batch_size=4 \
  --gradient_accumulation_steps=6 \
  --save_steps 3682 \
  --learning_rate 3e-5 \
  --num_train_epochs 2 \
  --max_seq_length 384 \
  --doc_stride 128 \
  --output_dir result/debug_squad/wwm_uncased_bert_large_finetuned_squad/ \
  --overwrite_output_dir
'''
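For reference, the effective training batch size implied by the command above can be worked out as follows. This is only a sketch: the values are copied from the flags, and `n_gpu = 1` is an assumption based on `CUDA_VISIBLE_DEVICES=5` exposing a single device.

```python
# Values copied from the command-line flags above.
per_gpu_train_batch_size = 4
gradient_accumulation_steps = 6

# Assumption: CUDA_VISIBLE_DEVICES=5 makes exactly one GPU visible.
n_gpu = 1

# Examples seen per optimizer step = per-GPU batch * visible GPUs * accumulation steps.
effective_batch_size = (
    per_gpu_train_batch_size * max(1, n_gpu) * gradient_accumulation_steps
)
print(effective_batch_size)
```

Under that assumption, each optimizer update sees 24 examples.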

The task I am working on is:

[x] an official GLUE/SQUaD task: (give the name)

Details

However, I could not reproduce the reported result. The repository says to expect the result below:

python $SQUAD_DIR/evaluate-v1.1.py $SQUAD_DIR/dev-v1.1.json ../models/wwm_uncased_finetuned_squad/predictions.json
{"exact_match": 86.91579943235573, "f1": 93.1532499015869}

My result is below:

python $SQUAD_DIR/evaluate-v1.1.py $SQUAD_DIR/dev-v1.1.json ../models/wwm_uncased_finetuned_squad/predictions.json
{"exact_match": 81.03, "f1": 88.02}

Environment info

transformers version:
Platform: Linux gpu19 3.10.0-1062.4.1.el7.x86_64 #1 SMP Fri Oct 18 17:15:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Python version: python3.6
PyTorch version (GPU?): 1.4.0
Using GPU in script?: yes
Using distributed or parallel set-up in script?: parallel

pytorch-transformers or pytorch-pretrained-bert version (or branch): current version of transformers.

Checklist

[yes ] I have read the migration guide in the readme. (pytorch-transformers;

MagicFrogSJTU commented 4 years ago

I have just solved this problem. You have to set an additional flag: --do_lower_case. I wonder why run_squad.py behaves differently from run_glue.py, etc. Is there a code improvement on the way?
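To illustrate why a missing lowercasing step can hurt an uncased model, here is a minimal, self-contained sketch. It is not the transformers tokenizer: the vocabulary `TOY_UNCASED_VOCAB` and the helpers `wordpiece` and `tokenize` are hypothetical, and the split is a simplified greedy longest-match WordPiece. It assumes only that an uncased vocabulary contains lowercase pieces.

```python
# Toy, hypothetical vocabulary: an uncased vocabulary holds only lowercase pieces.
TOY_UNCASED_VOCAB = {"[UNK]", "where", "is", "paris", "par", "##is", "?"}


def wordpiece(word, vocab):
    """Greedy longest-match-first WordPiece split of a single word."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        piece = None
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation pieces carry a ## prefix
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            # No piece matches: the whole word maps to the unknown token.
            return ["[UNK]"]
        pieces.append(piece)
        start = end
    return pieces


def tokenize(text, vocab, do_lower_case):
    """Whitespace-split the text, optionally lowercase it, then apply WordPiece."""
    if do_lower_case:
        text = text.lower()
    tokens = []
    for word in text.split():
        tokens.extend(wordpiece(word, vocab))
    return tokens


question = "Where is Paris ?"
with_lowercasing = tokenize(question, TOY_UNCASED_VOCAB, do_lower_case=True)
without_lowercasing = tokenize(question, TOY_UNCASED_VOCAB, do_lower_case=False)
print(with_lowercasing)     # ['where', 'is', 'paris', '?']
print(without_lowercasing)  # ['[UNK]', 'is', '[UNK]', '?']
```

In this toy setting, every capitalized word falls outside the vocabulary when lowercasing is skipped. A similar effect on real inputs would be consistent with the lower scores reported above.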

LysandreJik commented 4 years ago

You shouldn't have to set --do_lower_case as it should be lowercased by default for that model.

MagicFrogSJTU commented 4 years ago

> You shouldn't have to set --do_lower_case as it should be lowercased by default for that model.

I thought it was, and it should be, but it isn't.

julien-c commented 4 years ago

Closing this b/c #4245 was merged

(we still need to investigate why the lowercasing is not properly populated by the model's config)

huggingface / transformers

why squad.py did not reproduce squad1.1 report result? #4301
