tianjianjiang closed this issue 6 years ago
A small difference is a normal situation. Please post your results for comparison.
Hi @GabrielLin,
I understand that training usually involves randomness, but IMHO the differences are somewhat too big. For example:
MSR
Train Epoch 29 loss 0.426792 446.87 (sec) << Valid Epoch 29 loss 2.683211 P:0.974043 R:0.976689 F:0.975364 Test: P:0.975820 R:0.979153 F:0.977484 Best_F:0.976075 P:0.976320 R:0.978713 F:0.977515
PKU
Train Epoch 6 loss 3.556927 849.97 (sec) << Valid Epoch 6 loss 15.523698 P:0.964023 R:0.955136 F:0.959559 Test: P:0.956657 R:0.940425 F:0.948472 Best_F:0.960828 P:0.956911 R:0.949201 F:0.953041
I have tried setting random seeds for the data shuffle, numpy, and tensorflow, and additionally pinned training to a single GPU. The numbers still fluctuate.
Since PKU seems to stop after only a few epochs, I've tried continuing training after early stopping. With 4 rounds, 6+33+43+5 epochs in total, it converges in my environment.
Train Epoch 6 loss 3.573669 426.94 (sec) << Valid Epoch 6 loss 15.409616 P:0.964029 R:0.957178 F:0.960591 Test: P:0.957498 R:0.943251 F:0.950321 Best_F:0.962007 P:0.958340 R:0.951251 F:0.954782
Train Epoch 33 loss 0.737350 434.54 (sec) << Valid Epoch 33 loss 10.526677 P:0.971944 R:0.968199 F:0.970068 Test: P:0.965097 R:0.954269 F:0.959653 Best_F:0.970744 P:0.961356 R:0.954586 F:0.957959
Train Epoch 43 loss 0.503710 308.23 (sec) << Valid Epoch 43 loss 10.408177 P:0.972928 R:0.969850 F:0.971386 Test: P:0.966005 R:0.957805 F:0.961887 Best_F:0.971856 P:0.964821 R:0.959635 F:0.962221
Train Epoch 5 loss 1.404060 303.74 (sec) << Valid Epoch 5 loss 7.886835 P:0.975593 R:0.973874 F:0.974733 Test: P:0.966675 R:0.957460 F:0.962046 Best_F:0.975528 P:0.968025 R:0.960104 F:0.964048
Train Epoch 5 loss 1.404064 300.96 (sec) << Valid Epoch 5 loss 7.886860 P:0.975593 R:0.973874 F:0.974733 Test: P:0.966685 R:0.957479 F:0.962060 Best_F:0.975528 P:0.968025 R:0.960104 F:0.964048
Set the initial lr=0.01 to train PKU. The pre-trained file for model2 has been updated.
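In config.py this amounts to a one-line change; the attribute name below is an assumption, since the actual name in the repo's config.py may differ:

```python
# Hypothetical config.py fragment -- only illustrates where the
# initial learning rate would be set; the real attribute name may differ.
class Config(object):
    lr = 0.01  # initial learning rate suggested for training PKU
```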
@fudannlp16 I see. So that's what step 2, "set the hyperparameters in config.py according to the paper", was about; I failed to comprehend it before.
Setting lr=0.01 indeed improved F1 of PKU to the level according to the paper.
For MSR, however, I've only gotten it right one time out of ten. The difference between Model-I and Model-II for MSR is relatively small (97.8% - 97.6% = 0.2%), and my ten runs so far range from 97.74% to 97.82%, so IMHO the range is borderline acceptable. If your experiments behave the same way, I'll rest my case.
Thank you for all the support.
I've used
python 2.7.12
tensorflow-gpu 1.0.0
Ubuntu 16.04
to try to reproduce the same-domain experiments, but so far I have only obtained different (lower) test scores on PKU and MSR for model2. Please advise.
Some more info about my environment:
GeForce GTX 1080 Ti * 2
CUDA 8.0.61
CuDNN 5.1.10