NVIDIA / DeepLearningExamples

State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.

[Model] Got bad performance on BERT pre-train #427

Closed. GuiminChen closed this issue 4 years ago

GuiminChen commented 4 years ago

Hi friends, I need some help. I downloaded the Wikipedia and BookCorpus datasets to pre-train the BERT large model. I followed the same data preprocessing and pre-training steps as create_datasets_from_start.sh and run_pretraining.sh, and used SQuAD v1.1 to evaluate the model. I pre-trained several times, and the SQuAD v1.1 F1/Exact Match (EM) scores were 90.35/83.54 and 89.79/83.26. That is about 0.7 to 1.3 points lower than the mean scores reported in the README:
F1/EM (mean) = 91.08/84.30
F1/EM (max) = 91.29/84.50
F1/EM (min) = 90.85/84.17
I'm confused about where the gap comes from and would appreciate any help.
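For reference, the gap between these runs and the README mean can be checked with a few lines of Python. This is only a minimal sketch: the numbers are copied from the comment above, and the helper name `gaps` is made up for illustration.

```python
# Scores copied from the comment above, as (F1, EM) pairs.
reported_mean = (91.08, 84.30)
runs = [(90.35, 83.54), (89.79, 83.26)]


def gaps(reference, run):
    """Return the absolute point gaps (reference - run) for F1 and EM."""
    return tuple(round(ref - got, 2) for ref, got in zip(reference, run))


for run in runs:
    f1_gap, em_gap = gaps(reported_mean, run)
    print(f"F1 {run[0]:.2f} (gap {f1_gap:.2f}), EM {run[1]:.2f} (gap {em_gap:.2f})")
```

The gaps are measured in absolute points against the README mean, not as relative percentages.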

swethmandava commented 4 years ago

Is this in TensorFlow or PyTorch? Could you attach a sample log, please?

GuiminChen commented 4 years ago

This is in PyTorch. I set the dup (duplication) value for the pre-training data to 10, and the performance improved.
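To illustrate why a higher duplication value might help, here is a simplified sketch. It assumes "dup" means a duplication factor like the one in the original BERT data-creation script, where the input data is duplicated and each copy gets a different random mask. The function name `make_masked_copies`, the 15% masking rate used here, and the plain `[MASK]` replacement are illustrative assumptions, not the repository's actual code.

```python
import random

MASK = "[MASK]"


def make_masked_copies(tokens, dup, mask_prob=0.15, seed=0):
    """Return `dup` copies of `tokens`, each with its own random mask.

    Simplified: every selected position is replaced by [MASK]; real
    pipelines may also keep or randomly replace some selected tokens.
    """
    rng = random.Random(seed)
    n_mask = max(1, round(len(tokens) * mask_prob))
    copies = []
    for _ in range(dup):
        # Draw a fresh set of positions for every copy.
        positions = set(rng.sample(range(len(tokens)), n_mask))
        copies.append([MASK if i in positions else tok
                       for i, tok in enumerate(tokens)])
    return copies


sentence = [f"w{i}" for i in range(20)]
examples = make_masked_copies(sentence, dup=10)
print(len(examples), "masked copies of one sentence")
```

With a larger duplication value, the model sees the same text under more masking patterns, which is one plausible reason the scores improved.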