Hi, thanks for your nice work! I ran into a few issues while running your latest code on SQuAD 2.0.
1. `data_utils.py`/`feature_func`:

```python
if is_train:
    fea_dict['label'] = sample['label']
```

This assignment is needed when running on SQuAD 2.0 but must be skipped on SQuAD 1.1, whose samples carry no 'label' field. Otherwise training fails with a KeyError complaining that 'label' cannot be found.
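One way to avoid editing this line per dataset is to gate the lookup on the v2 switch as well. A minimal sketch, assuming a `v2_on` flag (or equivalent) can be threaded into `feature_func` — the signature here is illustrative, not the repo's:

```python
def feature_func(sample, is_train, v2_on):
    """Sketch of a guarded feature builder (signature assumed, not the repo's)."""
    fea_dict = {'uid': sample.get('uid')}
    # Only SQuAD 2.0 samples carry a 'label' (answerable flag), so guard on
    # the v2 switch as well; SQuAD 1.1 training then skips the lookup entirely.
    if is_train and v2_on:
        fea_dict['label'] = sample['label']
    return fea_dict

# A SQuAD 1.1 training sample with no 'label' key no longer raises KeyError.
print(feature_func({'uid': '1'}, is_train=True, v2_on=False))
```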
2. `data_utils.py`/`predict_squad`:

```python
for batch in data:
    if opt.get('v2_on', False):
        phrase, spans, scores = model.predict(batch)
    else:
        phrase, spans = model.predict(batch)
```

The loop should be changed as above; otherwise unpacking fails with an error about the wrong number of return values, since `model.predict` returns three values when `v2_on` is set and two otherwise.
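Alternatively, the arity difference could be absorbed in one place so that call sites never need to branch. A hedged sketch with a hypothetical wrapper (`predict_normalized` is not in the repo; `DummyModel` only stands in for the real model):

```python
def predict_normalized(model, batch, v2_on):
    """Return (phrase, spans, scores), with scores=None when v2 is off."""
    if v2_on:
        phrase, spans, scores = model.predict(batch)
    else:
        phrase, spans = model.predict(batch)
        scores = None  # SQuAD 1.1 has no answerability scores
    return phrase, spans, scores

# Dummy stand-in model to demonstrate both return arities.
class DummyModel:
    def __init__(self, v2_on):
        self.v2_on = v2_on
    def predict(self, batch):
        return ('p', 's', 0.9) if self.v2_on else ('p', 's')

print(predict_normalized(DummyModel(False), None, v2_on=False))  # ('p', 's', None)
```

Normalizing the return shape once keeps `predict_squad` (and any future caller) free of `v2_on` branching.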
3. `train.py`/`main`:

```python
model_file = os.path.join(model_dir, 'checkpointepoch{}.pt'.format(epoch))
model.save(model_file, epoch)
if em + f1 > best_em_score + best_f1_score:
    copyfile(model_file, os.path.join(model_dir, 'best_checkpoint.pt'))
    best_em_score, best_f1_score = em, f1
    logger.info('Saved the new best model and prediction')
logger.warning("Epoch {0} - dev EM: {1:.3f} F1: {2:.3f} (best EM: {3:.3f} F1: {4:.3f})".format(epoch, em, f1, best_em_score, best_f1_score))
if metric is not None:
    logger.warning("Epoch {0}: {1}".format(epoch, metric))
```

Note that `model_file` already contains `model_dir`, so it should be passed to `copyfile` directly rather than joined again. More importantly, the dev result mixes SQuAD 1.1 and SQuAD 2.0, so the best checkpoint should be tracked and saved separately for each dataset.
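To keep the two datasets from overwriting each other's best checkpoint, the best scores could be keyed by dataset version. A sketch under that assumption — the `'v1.1'`/`'v2.0'` keys, the filename pattern, and the helper are all hypothetical:

```python
import os

# Best (EM, F1) tracked per dataset so SQuAD 1.1 and 2.0 results don't mix.
best_scores = {'v1.1': (0.0, 0.0), 'v2.0': (0.0, 0.0)}

def best_checkpoint_path(model_dir, version, em, f1):
    """Return the per-dataset best-checkpoint path if (em, f1) is a new best, else None."""
    best_em, best_f1 = best_scores[version]
    if em + f1 > best_em + best_f1:
        best_scores[version] = (em, f1)
        return os.path.join(model_dir, 'best_checkpoint_{}.pt'.format(version))
    return None

# A new best for v2.0 leaves the v1.1 entry untouched.
print(best_checkpoint_path('model', 'v2.0', 70.0, 73.0))
```

The caller would then `copyfile(model_file, path)` whenever a non-`None` path comes back.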
4. `train.py`/`main`:

```python
else:
    result = evaluate(dev_gold, results)
    em, f1 = result['exact_match'], result['f1']
    output_path = os.path.join(model_dir, 'devoutput{}_v1.1.json'.format(epoch))
    with open(output_path, 'w') as f:
        json.dump(results, f)
```

`result` is a dictionary, so `em` and `f1` must be read with `result['exact_match']` and `result['f1']` rather than by tuple unpacking.
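For illustration (the numbers are made up), unpacking the dict directly would bind the key names rather than the scores, which is easy to miss because it does not even raise for a two-key dict:

```python
# evaluate() returns a dict along these lines (values illustrative):
result = {'exact_match': 71.3, 'f1': 74.4}

# Wrong: unpacking a dict iterates over its keys, silently yielding strings.
em, f1 = result
print(em, f1)   # exact_match f1

# Right: index the keys explicitly to get the scores.
em, f1 = result['exact_match'], result['f1']
print(em, f1)   # 71.3 74.4
```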
I'm very pleased with your code!