ivysoftware opened 5 years ago
I ran into the same situation.
I forked another repository which uses a BiLSTM-CRF on top of the BERT model:
https://github.com/dsindex/BERT-BiLSTM-CRF-NER/blob/master/README.md
That module yields an F1 score of 0.95–0.96 on the dev set. However, after aligning the predicted output (on the test set) with the original test data and evaluating it with conlleval.pl (the official evaluation script), the final F-score is around 91.1–91.3. This is worse than the score reported in the paper (BERT-base, 92.4).
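For anyone trying to reproduce this, here is a minimal sketch (with a toy, made-up example, not real CoNLL data) of the "token gold pred" format that conlleval.pl expects: one token per line, a blank line between sentences, which is what the alignment step has to produce.

```python
# Toy illustration of preparing input for conlleval.pl.
# Data below is hypothetical; in practice the tokens/gold come from the
# original CoNLL-2003 test file and pred from the model's output.

sentences = [
    # (token, gold tag, predicted tag)
    [("EU", "B-ORG", "B-ORG"), ("rejects", "O", "O"), ("German", "B-MISC", "B-MISC")],
    [("Peter", "B-PER", "B-PER"), ("Blackburn", "I-PER", "O")],
]

lines = []
for sent in sentences:
    for token, gold, pred in sent:
        lines.append(f"{token} {gold} {pred}")
    lines.append("")  # blank line marks the sentence boundary

output = "\n".join(lines)
print(output)
```

The resulting file is then scored with `perl conlleval.pl < output.txt`. Misaligned tokens (e.g. from WordPiece subwords not being merged back) silently distort the span-level F-score, which may explain part of the gap between the dev metric and the conlleval number.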
I guess there were additional tricks in the parameter tuning.
The paper mentions that hyperparameters were tuned on the dev set. Note that the authors had quite a lot of incentive, and the means, to tune them well.
I got only 0.87 with batch-size=16.
Thanks for your timely work! When running on a GPU, CoNLL-2003 doesn't perform as well as your results or the paper's. I tried several times; the dev F-score wanders between 0.89 and 0.912. Does your work run on a TPU?