Hi, thanks for your nice work! I ran into a few issues while running your latest code on SQuAD 2.0.
1. `data_utils.py`/`feature_func`:

```python
if is_train:
    fea_dict['label'] = sample['label']
```

This assignment is needed when running on SQuAD 2.0 but must be skipped on SQuAD 1.1, whose samples carry no 'label' field. Otherwise training fails with a KeyError complaining that 'label' cannot be found.
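One way to avoid editing this line per dataset is to gate the lookup on the v2 switch as well. A minimal sketch, assuming a `v2_on` flag (or equivalent) can be threaded into `feature_func` — the signature here is illustrative, not the repo's:

```python
def feature_func(sample, is_train, v2_on):
    """Sketch of a guarded feature builder (signature assumed, not the repo's)."""
    fea_dict = {'uid': sample.get('uid')}
    # Only SQuAD 2.0 samples carry a 'label' (answerable flag), so guard on
    # the v2 switch as well; SQuAD 1.1 training then skips the lookup entirely.
    if is_train and v2_on:
        fea_dict['label'] = sample['label']
    return fea_dict

# A SQuAD 1.1 training sample with no 'label' key no longer raises KeyError.
print(feature_func({'uid': '1'}, is_train=True, v2_on=False))
```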
2. `data_utils.py`/`predict_squad`:

```python
for batch in data:
    if opt.get('v2_on', False):
        phrase, spans, scores = model.predict(batch)
    else:
        phrase, spans = model.predict(batch)
```

The loop should be changed as above; otherwise unpacking fails with an error about the wrong number of return values, since `model.predict` returns three values when `v2_on` is set and two otherwise.
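Alternatively, the arity difference could be absorbed in one place so that call sites never need to branch. A hedged sketch with a hypothetical wrapper (`predict_normalized` is not in the repo; `DummyModel` only stands in for the real model):

```python
def predict_normalized(model, batch, v2_on):
    """Return (phrase, spans, scores), with scores=None when v2 is off."""
    if v2_on:
        phrase, spans, scores = model.predict(batch)
    else:
        phrase, spans = model.predict(batch)
        scores = None  # SQuAD 1.1 has no answerability scores
    return phrase, spans, scores

# Dummy stand-in model to demonstrate both return arities.
class DummyModel:
    def __init__(self, v2_on):
        self.v2_on = v2_on
    def predict(self, batch):
        return ('p', 's', 0.9) if self.v2_on else ('p', 's')

print(predict_normalized(DummyModel(False), None, v2_on=False))  # ('p', 's', None)
```

Normalizing the return shape once keeps `predict_squad` (and any future caller) free of `v2_on` branching.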
3. `train.py`/`main`:

```python
model_file = os.path.join(model_dir, 'checkpointepoch{}.pt'.format(epoch))
model.save(model_file, epoch)
if em + f1 > best_em_score + best_f1_score:
    copyfile(model_file, os.path.join(model_dir, 'best_checkpoint.pt'))
    best_em_score, best_f1_score = em, f1
    logger.info('Saved the new best model and prediction')
logger.warning("Epoch {0} - dev EM: {1:.3f} F1: {2:.3f} (best EM: {3:.3f} F1: {4:.3f})".format(epoch, em, f1, best_em_score, best_f1_score))
if metric is not None:
    logger.warning("Epoch {0}: {1}".format(epoch, metric))
```

Note that `model_file` already contains `model_dir`, so it should be passed to `copyfile` directly rather than joined again. More importantly, the dev result mixes SQuAD 1.1 and SQuAD 2.0, so the best checkpoint should be tracked and saved separately for each dataset.
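To keep the two datasets from overwriting each other's best checkpoint, the best scores could be keyed by dataset version. A sketch under that assumption — the `'v1.1'`/`'v2.0'` keys, the filename pattern, and the helper are all hypothetical:

```python
import os

# Best (EM, F1) tracked per dataset so SQuAD 1.1 and 2.0 results don't mix.
best_scores = {'v1.1': (0.0, 0.0), 'v2.0': (0.0, 0.0)}

def best_checkpoint_path(model_dir, version, em, f1):
    """Return the per-dataset best-checkpoint path if (em, f1) is a new best, else None."""
    best_em, best_f1 = best_scores[version]
    if em + f1 > best_em + best_f1:
        best_scores[version] = (em, f1)
        return os.path.join(model_dir, 'best_checkpoint_{}.pt'.format(version))
    return None

# A new best for v2.0 leaves the v1.1 entry untouched.
print(best_checkpoint_path('model', 'v2.0', 70.0, 73.0))
```

The caller would then `copyfile(model_file, path)` whenever a non-`None` path comes back.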
4. `train.py`/`main`:

```python
else:
    result = evaluate(dev_gold, results)
    em, f1 = result['exact_match'], result['f1']
    output_path = os.path.join(model_dir, 'devoutput{}_v1.1.json'.format(epoch))
    with open(output_path, 'w') as f:
        json.dump(results, f)
```

`result` is a dictionary, so `em` and `f1` must be read with `result['exact_match']` and `result['f1']` rather than by tuple unpacking.
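For illustration (the numbers are made up), unpacking the dict directly would bind the key names rather than the scores, which is easy to miss because it does not even raise for a two-key dict:

```python
# evaluate() returns a dict along these lines (values illustrative):
result = {'exact_match': 71.3, 'f1': 74.4}

# Wrong: unpacking a dict iterates over its keys, silently yielding strings.
em, f1 = result
print(em, f1)   # exact_match f1

# Right: index the keys explicitly to get the scores.
em, f1 = result['exact_match'], result['f1']
print(em, f1)   # 71.3 74.4
```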
I'm very pleased with your code!