BERTserini
https://github.com/castorini/bertserini
Apache License 2.0

replicates evaluation scores #2

Closed MXueguang closed 4 years ago

MXueguang commented 4 years ago

Hi, I am trying to replicate the results by following the README.

Are ## BERT-large-wwm-uncased and ## BERT-base-uncased, as mentioned in the evaluation results, the same as rsvp-ai/bertserini-bert-large-squad and rsvp-ai/bertserini-bert-base-squad, respectively?

The evaluation in my run gives the following for rsvp-ai/bertserini-bert-large-squad:

(0.4, {'exact_match': 41.54210028382214, 'f1': 49.45378799697662, 'recall': 51.119838584003105, 'precision': 49.8395951713666, 'cover': 47.228003784295176, 'overlap': 57.6631977294229})

which differs from the scores listed under ## BERT-large-wwm-uncased.

The results for rsvp-ai/bertserini-bert-base-squad, however, matched the scores under ## BERT-base-uncased.
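
(For anyone spot-checking these checkpoints independently, here is a minimal sketch. It assumes the rsvp-ai checkpoints load as standard Hugging Face SQuAD reader models, which their names suggest but the repo would confirm; the question and context strings are illustrative and not part of the repo's evaluation.)

```python
from transformers import pipeline

# Hypothetical spot check: load the checkpoint as a vanilla
# Hugging Face QA pipeline and ask one question. This is not the
# repo's full retrieval-plus-reading evaluation, just a sanity
# check that the model behaves like a standard SQuAD reader.
qa = pipeline(
    "question-answering",
    model="rsvp-ai/bertserini-bert-large-squad",
    tokenizer="rsvp-ai/bertserini-bert-large-squad",
)

answer = qa(
    question="Who wrote Hamlet?",
    context="Hamlet is a tragedy written by William Shakespeare.",
)
print(answer)  # dict with 'score', 'start', 'end', and 'answer' keys
```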

amyxie361 commented 4 years ago

Yes, they are the same. Thank you! I reran the checkpoint and got the same result as you. Fixed in the README.