MattYoon closed this 2 years ago
Hi @MattYoon, our baselines were trained on the train set only. The results may vary with different hardware; you can adjust the hyper-parameters and train again.
Thank you for your fast response!
Did you use early stopping to obtain the test results? That is, when it says a certain model was trained for 5 epochs, did you pick the best-performing epoch based on the dev result?
Yes, we set the number of training epochs in advance and selected the best model based on the dev results. We did not use early stopping on the CMeEE, CMeIE, CDN, and CTC tasks.
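For readers who want to reproduce this kind of selection, here is a minimal sketch of the procedure as described: train for a fixed number of epochs, evaluate on dev after each one, and keep the best epoch. It is not the repository's training code. The function `select_best_epoch`, the callbacks `train_one_epoch` and `evaluate_on_dev`, and the dev scores in the toy run are all hypothetical placeholders, not real results.

```python
from typing import Callable, List, Tuple


def select_best_epoch(
    num_epochs: int,
    train_one_epoch: Callable[[int], None],
    evaluate_on_dev: Callable[[int], float],
) -> Tuple[int, float, List[float]]:
    """Train for a fixed number of epochs and report the best dev epoch.

    Training is never cut short: every epoch runs, and the epoch with the
    highest dev score is the one whose checkpoint would be used for testing.
    """
    best_epoch, best_score = -1, float("-inf")
    history: List[float] = []
    for epoch in range(1, num_epochs + 1):
        train_one_epoch(epoch)          # hypothetical: one pass over the train set
        score = evaluate_on_dev(epoch)  # hypothetical: dev metric after this epoch
        history.append(score)
        if score > best_score:
            # In a real run, this is where the checkpoint would be saved.
            best_epoch, best_score = epoch, score
    return best_epoch, best_score, history


# Toy run with made-up dev scores (placeholders, not reported numbers).
fake_dev_scores = {1: 0.55, 2: 0.61, 3: 0.60, 4: 0.62, 5: 0.59}
best_epoch, best_score, history = select_best_epoch(
    num_epochs=5,
    train_one_epoch=lambda epoch: None,
    evaluate_on_dev=lambda epoch: fake_dev_scores[epoch],
)
print(best_epoch, best_score)
```

Because the selection uses the dev set only, the test set plays no part in choosing the checkpoint.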
May I ask whether the baseline scores are on the test set or the dev set?
Hi.
Were the baselines trained on both the train set and the dev set before testing, or on the train set only?
I used the exact hyper-parameters mentioned in your paper and ran the baseline code to test hfl/chinese-bert-wwm-ext on CMeEE. The results were confusing: with train + dev I got 62.8, and with train only I got 60.7. The test score reported in your paper is 61.7. Can you please tell me which way the baselines were tested? Thank you.