tricktreat / locate-and-label

Code for "Locate and Label: A Two-stage Identifier for Nested Named Entity Recognition", accepted at ACL 2021.

Why does the entity classifier use cross entropy loss? #19

Closed terenceau2 closed 7 months ago

terenceau2 commented 7 months ago

Hi,

In the paper you explain that the span filter uses focal loss, mainly because there are a huge number of 'non entity' spans and only a few entity spans, a situation similar to object detection; focal loss keeps this irrelevant background data from dominating the span filter's loss.

By the same logic, the entity classifier also has to forward-pass a large number of candidate spans. During training, the span filter's decisions do not affect which data are forward-passed to the entity classifier. That is, if 100 spans are forward-passed to the span filter, the same 100 spans are also forward-passed to the entity classifier (in practice slightly fewer than 100, since the boundary-offset step may discard some spans with illegal boundaries). Given that, why does the paper propose cross entropy loss rather than focal loss for the entity classifier? Its situation is exactly the same as the span filter's.
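For reference, the focal loss behavior discussed here can be sketched as follows (a minimal PyTorch sketch of the standard binary focal loss of Lin et al.; names and defaults are illustrative, not the repo's actual code):

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0, alpha=0.25):
    """Binary focal loss: down-weights easy, well-classified examples
    (here, the overwhelming number of background spans) so they do not
    dominate the span filter's loss."""
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p = torch.sigmoid(logits)
    p_t = p * targets + (1 - p) * (1 - targets)          # prob. of the true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    return (alpha_t * (1 - p_t) ** gamma * ce).mean()
```

On a batch dominated by confidently classified negatives, this loss is much smaller than plain BCE, which is exactly the dominance effect the question describes.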

tricktreat commented 7 months ago

Hi, during training the span filter's decisions can affect which data are forward-passed to the entity classifier: the spn_mask can be set to filter out the first stage's negative samples:

https://github.com/tricktreat/locate-and-label/blob/0e05376fb0174e7eacf66d3c6443a8f1dcab8d5d/identifier/loss.py#L83

But I commented it out here, probably because the results were about the same either way.
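The masking described above can be sketched as follows (an illustrative sketch, not the repo's actual loss.py; the function name and shapes are assumptions):

```python
import torch
import torch.nn.functional as F

def classifier_loss_with_mask(logits, labels, spn_mask):
    """Entity-classifier loss restricted to spans kept by the span filter.

    logits:   [num_spans, num_classes] classifier scores
    labels:   [num_spans] gold entity-type indices
    spn_mask: [num_spans] 1 = span survived the first stage, 0 = filtered out
    """
    kept = spn_mask.bool()
    if kept.sum() == 0:
        # No surviving spans: contribute zero loss for this batch.
        return logits.new_zeros(())
    # Only first-stage positives reach the cross entropy loss.
    return F.cross_entropy(logits[kept], labels[kept])
```

With the mask applied, the first stage's negatives never enter the classifier loss, which removes the class-imbalance argument for focal loss in the second stage.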