kyzhouhzau / BERT-NER

Use Google's BERT for named entity recognition (CoNLL-2003 as the dataset).
MIT License

Result of NER #42

Open sbmaruf opened 5 years ago

sbmaruf commented 5 years ago

Your final result seems to be:

accuracy:  98.07%; precision:  90.65%; recall:  88.29%; FB1:  89.45
              LOC: precision:  92.50%; recall:  91.71%; FB1:  92.10  1387
             MISC: precision:  82.63%; recall:  76.99%; FB1:  79.71  668
              ORG: precision:  88.75%; recall:  84.22%; FB1:  86.43  1191
              PER: precision:  94.51%; recall:  94.72%; FB1:  94.62  1311

Result description: As Google's paper says, a 0.2% error is reasonable (reported 92.4%).

How is this result comparable to Google's result? Google's result was 92.4 for BERT base and 92.8 for BERT large. This result is 89.45.
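For reference, the FB1 column is the harmonic mean of precision and recall, so the size of the gap can be checked directly from the numbers quoted above. A quick sketch (the function name `f1_score` is only illustrative and is not part of the repository):

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    return 2 * precision * recall / (precision + recall)

# Overall precision and recall from the evaluation output quoted above.
overall_f1 = f1_score(90.65, 88.29)

# BERT base figure quoted in this thread.
reported_f1 = 92.4
gap = reported_f1 - overall_f1

print(f"overall FB1: {overall_f1:.2f}")
print(f"gap to the reported result: {gap:.2f} points")
```

The gap is close to 3 F1 points, which is well outside a 0.2% margin.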

kyzhouhzau commented 5 years ago

Yes, you are right, but under the existing experimental conditions, I can't improve the results to about 92.4%. Maybe some tricks need to be used, or some parameters need to be adjusted.

sbmaruf commented 5 years ago

Hi @kyzhouhzau. Any follow-up on this? Have you found a way to reproduce the original result of the BERT paper?

zwd13122889 commented 4 years ago

@sbmaruf Hi, I want to ask a question. Does this model use POS tags for NER?

zwd13122889 commented 4 years ago

Excuse me. Where does label_test.txt come from? Is it man-made or machine-generated?

sbmaruf commented 4 years ago

@zwd13122889 See here: https://github.com/kyzhouhzau/BERT-NER/blob/master/BERT_NER.py#L190-L195. It only reads the first token and the last token of each line. In the CoNLL dataset, the first token is the text and the last token is the NER label. So the POS tag is not used.
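A minimal sketch of that kind of parsing, assuming space-separated CoNLL-2003 columns in the order word, POS tag, chunk tag, NER label. The function name and the sample line are illustrative, not copied from `BERT_NER.py`:

```python
def read_token_and_label(line):
    """Keep only the first and last column of a CoNLL-style line."""
    fields = line.strip().split(" ")
    word = fields[0]    # first column: the token text
    label = fields[-1]  # last column: the NER label
    return word, label

# Illustrative line: word, POS tag, chunk tag, NER label.
sample = "EU NNP B-NP B-ORG\n"
word, label = read_token_and_label(sample)
print(word, label)
```

The middle columns (`NNP` and `B-NP` in the sample) are never read by this kind of parser.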

zwd13122889 commented 4 years ago

@sbmaruf OK. How long will it take to run this script on a GPU?

sbmaruf commented 4 years ago

It depends on which GPU you are using. I don't remember exactly, but it should not take more than 2-3 hours on a GTX 1080 Ti or 2080 Ti.

zwd13122889 commented 4 years ago

OK. I ran it on my own data, but I have a problem, shown in the picture: [image: 微信截图_20191030151556 (WeChat screenshot)]

The left is the author's data, the right is mine.

yuhongqian commented 4 years ago

Hi, does anyone know what the 98.07% on the first line means?

sbmaruf commented 4 years ago

Hi, it's the token-level accuracy. For multi-class classification with imbalanced classes, as in NER where most tokens are tagged O, accuracy is not a good measure.
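A small sketch of why the two numbers diverge, using made-up BIO tags rather than real model output. The span extraction here is a simplification and does not reproduce every rule of the conlleval script (for example, an `I-` tag with no preceding `B-` is simply ignored):

```python
def extract_entities(tags):
    """Return a set of (start, end, type) spans from a list of BIO tags."""
    spans, start, etype = set(), None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes the last span
        inside = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not inside:
            spans.add((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans

# Toy data: 20 tokens, two gold entities, one predicted span cut short.
gold = ["O"] * 20
gold[2], gold[3], gold[10] = "B-ORG", "I-ORG", "B-PER"
pred = list(gold)
pred[3] = "O"  # the ORG entity loses its second token

# Token-level accuracy counts every tag, including the many O tags.
accuracy = sum(g == p for g, p in zip(gold, pred)) / len(gold)

# Entity-level scores only count exact span matches.
gold_spans, pred_spans = extract_entities(gold), extract_entities(pred)
correct = len(gold_spans & pred_spans)
precision = correct / len(pred_spans)
recall = correct / len(gold_spans)
f1 = 2 * precision * recall / (precision + recall)

print(f"token accuracy: {accuracy:.2f}, entity F1: {f1:.2f}")
```

A single wrong tag costs one token out of twenty in accuracy but one entity out of two in F1, which is why the first number in the evaluation output looks so much higher than FB1.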

yuhongqian commented 4 years ago

Oh I see! I was fooled by the format. Thanks a lot!