Closed MXueguang closed 4 years ago
The base model is fine-tuned from `bert-base-uncased`, and the bert-large model is based on `bert-large-uncased-whole-word-masking`. What do you mean by "there isn't a checkpoint that matches `rsvp-ai/bertserini-bert-base-squad` exactly"? For the large model, I evaluated on the "best" checkpoint; the difference in score might come from this. As for `bert-large-uncased-whole-word-masking-finetuned-squad`: I forgot to change the name back in the training script, will change it back.
Hi, I opened this PR to discuss issues and questions that I ran into during replication and to make the corresponding modifications.
Here are two questions first:
1. I fine-tuned from `bert-base-uncased` using the parameters stated in `train.sh`; however, it seems there isn't a checkpoint that matches `rsvp-ai/bertserini-bert-base-squad` exactly. I suppose `rsvp-ai/bertserini-bert-large-squad` is fine-tuned from `bert-large-uncased-whole-word-masking`? I tried to replicate this too (using the parameters stated in `train.sh`); my last checkpoint gives `(0.5, {'exact_match': 42.100283822138124, 'f1': 49.63275436249586, 'recall': 51.23401819043994, 'precision': 50.18438555675959, 'cover': 47.38883632923368, 'overlap': 57.994323557237465})`.
2. In the `train.sh` file, the `model_name_or_path` used is `bert-large-uncased-whole-word-masking-finetuned-squad`. Is either of the two `rsvp-ai/<bert_squad>` models fine-tuned from `bert-large-uncased-whole-word-masking-finetuned-squad`? Isn't that model already fine-tuned from `bert-large-uncased-whole-word-masking`? I evaluated the `bert-large-uncased-whole-word-masking-finetuned-squad` model and added the results in the evaluation part: `(0.5, {'exact_match': 43.65184484389783, 'f1': 50.942504639546485, 'recall': 52.32886737510793, 'precision': 51.54318623526059, 'cover': 48.57142857142857, 'overlap': 58.63765373699149})`.