Open Danielfile opened 3 years ago
Thanks for your interest in our paper.
The baseline results we reported come from runs on our own machines. Because of machine variance, you may get different results even with the same hyper-parameter settings.
I ran your code many times, but I can't reproduce the results without WM. For example, my (F, R_oov) scores on PKU are (0.966448, 0.868110) for "BERT+SM+noWM" and (0.966335, 0.865612) for "BERT+CRF+noWM". But the corresponding scores shown in your paper are (96.20, 84.43) and (96.32, 85.04), respectively. Why is this happening?
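To make the gap easier to read, here is a minimal sketch that puts both sets of numbers on the same scale. It assumes the reproduced scores are fractions and the paper's scores are percentages, which is how they appear above; the function name `score_gaps` and the dictionary layout are only for illustration and are not part of the repository.

```python
# Scores quoted in this issue. Reproduced values are fractions (0-1),
# paper values are percentages (0-100). Tuples are (F, R_oov) on PKU.
reproduced = {
    "BERT+SM+noWM": (0.966448, 0.868110),
    "BERT+CRF+noWM": (0.966335, 0.865612),
}
paper = {
    "BERT+SM+noWM": (96.20, 84.43),
    "BERT+CRF+noWM": (96.32, 85.04),
}


def score_gaps(reproduced, paper):
    """Return {model: (F gap, R_oov gap)} in percentage points.

    A positive gap means the reproduced score is higher than the paper's.
    """
    gaps = {}
    for model, (f, r_oov) in reproduced.items():
        paper_f, paper_r = paper[model]
        # Convert the fractional scores to percentages before subtracting.
        gaps[model] = (
            round(f * 100 - paper_f, 2),
            round(r_oov * 100 - paper_r, 2),
        )
    return gaps


gaps = score_gaps(reproduced, paper)
for model, (f_gap, r_gap) in gaps.items():
    print(f"{model}: F {f_gap:+.2f}, R_oov {r_gap:+.2f}")
```

On these numbers the reproduced runs come out higher than the paper on both metrics, by roughly 0.3 to 0.4 points in F and 1.5 to 2.4 points in R_oov.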