BrambleXu / knowledge-graph-learning

A curated list of awesome knowledge graph tutorials, projects and communities.

EMNLP-2019/11-CrossWeigh: Training Named Entity Tagger from Imperfect Annotations #263

Open BrambleXu opened 4 years ago

BrambleXu commented 4 years ago

Summary:

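The paper corrects the label mistakes in the test set, then scores sentences to identify potential label mistakes and lowers the weights of those sentences. An NER model trained this way is aware of label mistakes.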

Resource:

Paper information:

Notes:

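There are two kinds of NER label mistakes: mistakes in the test set, which distort evaluation results, and mistakes in the training set, which affect the trained NER model. This paper manually corrects the label mistakes in the CoNLL03 test set and re-evaluates a range of models on the corrected set. It then proposes a new framework, CrossWeigh, to handle label mistakes during training.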


CrossWeigh consists of two parts, mistake estimation and mistake re-weighing:

  1. mistake estimation: it identifies the potential label mistakes in training data through a cross-checking process
  2. mistake re-weighing: it lowers the weights of these instances during the training of the final NER model.

The cross-checking process is inspired by k-fold cross validation. The difference is that, from each fold's training data, it removes the sentences containing any of the entities that appear in that fold's held-out data.
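
The two steps can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function names, the toy previous-token tagger standing in for a real NER model, the token-level (non-BIO) tags, and the fixed `low_weight` value are all assumptions made for this example. The notes above only say that flagged sentences get a lower weight, not how much lower.

```python
from collections import Counter, defaultdict


def entity_set(sentence):
    """Entity surface forms in a sentence of (token, tag) pairs.
    Simplification (assumption): every non-"O" token counts as one entity."""
    return {tok for tok, tag in sentence if tag != "O"}


def train_context_tagger(sentences):
    """Toy stand-in for an NER model (hypothetical): predicts each token's tag
    from the previous token only, using the majority tag seen in training."""
    counts = defaultdict(Counter)
    for sent in sentences:
        prev = "<s>"
        for tok, tag in sent:
            counts[prev][tag] += 1
            prev = tok
    table = {ctx: c.most_common(1)[0][0] for ctx, c in counts.items()}

    def predict(tokens):
        # Pair each token with its left context; unseen contexts default to "O".
        return [table.get(ctx, "O") for ctx, _ in zip(["<s>"] + tokens, tokens)]

    return predict


def estimate_mistakes(sentences, k, train_fn=train_context_tagger):
    """Step 1: cross-checking. Returns indices of sentences whose labels
    disagree with a model trained without them and without their entities."""
    flagged = set()
    folds = [list(range(i, len(sentences), k)) for i in range(k)]
    for held_out in folds:
        held = set(held_out)
        held_entities = set().union(*(entity_set(sentences[i]) for i in held_out))
        # Unlike plain k-fold CV, also drop training sentences that share
        # any entity with the held-out fold.
        train = [s for i, s in enumerate(sentences)
                 if i not in held and not (entity_set(s) & held_entities)]
        predict = train_fn(train)
        for i in held_out:
            tokens = [tok for tok, _ in sentences[i]]
            gold = [tag for _, tag in sentences[i]]
            if predict(tokens) != gold:
                flagged.add(i)
    return flagged


def reweigh(n, flagged, low_weight=0.5):
    """Step 2: per-sentence training weights; low_weight is a hypothetical value."""
    return [low_weight if i in flagged else 1.0 for i in range(n)]


sentences = [
    [("Mr.", "O"), ("Smith", "PER")],
    [("Mr.", "O"), ("Jones", "PER")],
    [("Mr.", "O"), ("Brown", "O")],  # deliberately mislabelled
    [("Mr.", "O"), ("Lee", "PER")],
    [("Mr.", "O"), ("Chen", "PER")],
    [("Mr.", "O"), ("Diaz", "PER")],
]
flagged = estimate_mistakes(sentences, k=3)
weights = reweigh(len(sentences), flagged)
```

On this toy data only the mislabelled "Mr. Brown" sentence is flagged and down-weighted. The entity filter is what keeps the check honest: a model that has already seen "Smith" tagged as PER could simply memorize that label, mistakes included, rather than judge it from context.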

Model Graph:

Result:

Thoughts:

Next Reading: