yangjianxin1 closed this issue 2 years ago
Thanks for your interest in our paper.
This could be one reason. What matters more, however, is how to extract better-matched words for each character — in other words, how to avoid assigning noisy words to characters. In my experience, the environment and CUDA version may also affect the results.
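To illustrate what "assigning words to characters" means here, below is a minimal sketch of LEBERT-style character-to-word matching: for every character, collect all lexicon words whose span covers it. A noisy lexicon assigns many irrelevant words per character, which is the issue mentioned above. The function name, the toy lexicon, and the set-based lookup are all illustrative assumptions, not code from this repository.

```python
def match_words(sentence, lexicon, max_word_len=4):
    """For each character index, return the lexicon words covering it.

    This is a simplified sketch: a real implementation would typically
    use a trie over the lexicon instead of a plain set lookup.
    """
    words_per_char = [[] for _ in sentence]
    for start in range(len(sentence)):
        # Try every candidate span beginning at `start`, up to max_word_len.
        for end in range(start + 1, min(start + max_word_len, len(sentence)) + 1):
            span = sentence[start:end]
            if span in lexicon:
                # Attach the matched word to every character it covers.
                for i in range(start, end):
                    words_per_char[i].append(span)
    return words_per_char

# Toy example (classic segmentation-ambiguity sentence):
lexicon = {"南京", "南京市", "市长", "长江", "长江大桥", "大桥"}
result = match_words("南京市长江大桥", lexicon)
# The character "长" (index 3) is covered by both "市长" and "长江",
# showing how ambiguous or noisy word assignments arise.
```

In LEBERT proper, the matched words are then embedded with pretrained word vectors and fused into the BERT layers, so the quality of both the lexicon and the word embeddings directly affects the final scores.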
As for the better baseline: it is well known that finetuning pretrained models produces small fluctuations across runs.
Following the ideas in the LEBERT paper, I reproduced the experiments. LEBERT roughly matches the metrics reported in the paper, but my reproduced BERT baseline performs somewhat better than the paper's figures. LEBERT outperforms BERT by 0.5-1.0 points on the four datasets, which is not a very large gain; I am not sure whether this is related to the quality of the word embeddings. Reproduction details: https://github.com/yangjianxin1/LEBERT-NER-Chinese