Closed (PROoshio closed this issue 4 years ago)
There are two flags implemented for the purpose:
parser.add_argument("--save_best", action='store_true',
help="Save best epoch.")
parser.add_argument("--only_eval_best", action='store_true',
help="Only evaluate best epoch.")
We used --save_best on the dev set, and then added --only_eval_best for the final evaluation.
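For readers who want to see how the two flags might fit together, here is a minimal sketch. Only the two `add_argument` calls come from the reply above; the helper names `build_parser`, `select_best_epoch` and `epochs_to_evaluate`, and the dev scores in the usage lines, are hypothetical and are not taken from the repository, whose actual selection logic may differ.

```python
import argparse


def build_parser():
    """Build a parser with the two flags quoted in the reply above."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--save_best", action="store_true",
                        help="Save best epoch.")
    parser.add_argument("--only_eval_best", action="store_true",
                        help="Only evaluate best epoch.")
    return parser


def select_best_epoch(dev_scores):
    """Hypothetical helper: return the 1-based epoch with the highest dev score.

    Ties are resolved in favor of the earliest epoch.
    """
    best_index = max(range(len(dev_scores)), key=lambda i: dev_scores[i])
    return best_index + 1


def epochs_to_evaluate(args, dev_scores):
    """Hypothetical helper: choose which epochs to run the final evaluation on."""
    if args.only_eval_best:
        return [select_best_epoch(dev_scores)]
    return list(range(1, len(dev_scores) + 1))


if __name__ == "__main__":
    # Illustrative dev scores only; these are not results from the paper.
    dev_scores = [0.61, 0.67, 0.65]
    args = build_parser().parse_args(["--save_best", "--only_eval_best"])
    print(epochs_to_evaluate(args, dev_scores))
```

Under this assumed workflow, the dev set alone decides which epoch is kept, and the final evaluation touches only that checkpoint.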
@donglixp Hi, I am looking for how to run the evaluation after using the decode code of s2sft or question generation on SQuAD (I fine-tuned the model beforehand).
Please refer to https://github.com/microsoft/unilm/tree/master/unilm-v1#question-generation---squad for the evaluation of QG.
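Hello author, are the results reported in the paper the evaluation results from the last epoch, or the test results from the epoch with the highest dev set score? Also, if it is the former, has the random seed been released? Thanks~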