mistake estimation: it identifies potential label mistakes in the training data through a cross-checking process.
mistake re-weighing: it lowers the weights of these instances when training the final NER model. The cross-checking process is inspired by k-fold cross validation; the difference is that, from each fold's training data, it removes every sentence containing any entity that appears in the held-out fold.
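A minimal sketch of the entity-disjoint split described above, to make the difference from plain k-fold concrete. The function name `entity_disjoint_folds`, the toy sentences, and matching entities by exact surface string are all assumptions for illustration, not the paper's code.

```python
import random


def entity_disjoint_folds(sentences, k=3, seed=0):
    """Yield (train_idx, heldout_idx) pairs for k folds.

    `sentences` is a list of (tokens, entities) pairs, where `entities`
    is a set of entity surface strings (a hypothetical representation).
    Training sentences that share any entity with the held-out fold
    are dropped from that fold's training data.
    """
    idx = list(range(len(sentences)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for held_out in folds:
        # Collect every entity that appears in the held-out fold.
        held_entities = set()
        for i in held_out:
            held_entities |= sentences[i][1]
        held = set(held_out)
        # Keep only sentences outside the fold with no overlapping entity.
        train = [
            i for i in idx
            if i not in held and not (sentences[i][1] & held_entities)
        ]
        yield sorted(train), sorted(held_out)


# Toy data (made up for this sketch).
sentences = [
    (["EU", "rejects", "German", "call"], {"EU", "German"}),
    (["Peter", "Blackburn"], {"Peter Blackburn"}),
    (["German", "exports", "rise"], {"German"}),
    (["BRUSSELS", "1996-08-22"], {"BRUSSELS"}),
    (["The", "EU", "said"], {"EU"}),
    (["No", "entities", "here"], set()),
]

for train_idx, heldout_idx in entity_disjoint_folds(sentences, k=3):
    print("held out:", heldout_idx, "train:", train_idx)
```

A model trained on `train_idx` has never seen the held-out entities, so a disagreement between its prediction and the annotation on a held-out sentence is a signal that the annotation may be wrong, rather than a memorized label.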
Summary:
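The paper corrects the label mistakes in the test set, then scores each training sentence to identify potential label mistakes and lowers the weights of those sentences. The NER model trained this way is aware of label mistakes.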
Resource:
Paper information:
Notes:
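There are two kinds of label mistakes in NER: mistakes in the test set distort the evaluation results, and mistakes in the training set hurt the trained NER model. This paper manually corrects the label mistakes in the CoNLL03 test set and re-evaluates a range of models on the corrected set. It then proposes a new framework, CrossWeigh, to handle label mistakes during training.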
CrossWeigh consists of two parts: mistake estimation and mistake re-weighing.
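A rough sketch of how the two parts could fit together. These notes do not record the exact weighting formula, so the exponential form below (weight = epsilon raised to the number of times a sentence was flagged) and the default `epsilon=0.7` are assumptions, as are all function names and the toy tags.

```python
def count_flags(gold_tags, predicted_runs):
    """Count, per sentence, how many runs disagree with the annotation.

    `gold_tags` is a list of tag sequences; `predicted_runs` holds one
    list of predicted tag sequences per cross-checking run.
    """
    counts = [0] * len(gold_tags)
    for preds in predicted_runs:
        for i, (gold, pred) in enumerate(zip(gold_tags, preds)):
            if list(gold) != list(pred):
                counts[i] += 1  # flagged as a potential label mistake
    return counts


def sentence_weights(counts, epsilon=0.7):
    """Assumed form: each flag multiplies the weight by epsilon."""
    return [epsilon ** c for c in counts]


def weighted_loss(per_sentence_losses, weights):
    """Sum of per-sentence losses scaled by their weights."""
    return sum(w * l for w, l in zip(weights, per_sentence_losses))


# Toy annotations and two hypothetical cross-checking runs.
gold = [["B-ORG", "O"], ["O", "O"], ["B-PER", "I-PER"]]
runs = [
    [["B-ORG", "O"], ["O", "O"], ["B-PER", "O"]],
    [["B-MISC", "O"], ["O", "O"], ["B-PER", "O"]],
]

counts = count_flags(gold, runs)
weights = sentence_weights(counts)
print(counts, weights)
```

Sentences that are never flagged keep weight 1, so the final model trains on them as usual; repeatedly flagged sentences still contribute, just less.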
Model Graph:
Result:
Thoughts:
Next Reading: