juntaoy / biaffine-ner

Named Entity Recognition as Dependency Parsing
Apache License 2.0

question about code #13

Closed onehaitao closed 4 years ago

onehaitao commented 4 years ago

Hi, your code is great, but I still have a question about the code (biaffine_ner_model.py, lines 249 to 256).

    top_spans = [[] for _ in xrange(len(sentences))]
    for i, type in enumerate(np.argmax(span_scores, axis=1)):
      if type > 0:
        sid, s, e = candidates[i]
        top_spans[sid].append((s, e, type, span_scores[i, type]))

    top_spans = [sorted(top_span, reverse=True, key=lambda x: x[3]) for top_span in top_spans]

Here, you sort the predicted spans in descending order of span score. Reading the code, I found that the span scores are logits rather than probabilities, and I wonder why that is.

juntaoy commented 4 years ago

Hi, the reason we use logits instead of probabilities is that we don't want to select spans the system is unsure about. Say a span has a very low score for every category: that means the system is unsure. Using the logits directly tells us how certain the system is, without being biased by the scores of the other categories.
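To make the difference concrete, here is a minimal sketch with made-up numbers (the two score rows are hypothetical, not taken from the model, and the helper names are only for illustration). Span 0 scores low for every category, span 1 scores high for one of them. Softmax only looks at the gaps between categories, so it can rank the low-scoring span first, while the raw logits rank it last.

```python
import math

def softmax(row):
    """Numerically stable softmax over one row of logits."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [v / total for v in exps]

# Hypothetical logits for two candidate spans over three categories
# (index 0 plays the role of the "not an entity" category).
span_scores = [
    [-9.0, -3.0, -9.0],  # span 0: low score everywhere, the model is unsure
    [2.0, 5.0, 1.0],     # span 1: clearly high score for category 1
]

# Best category per span, as np.argmax(span_scores, axis=1) would give.
best_type = [max(range(len(row)), key=row.__getitem__) for row in span_scores]

# Score of the best category, once as a raw logit and once as a probability.
best_logit = [row[t] for row, t in zip(span_scores, best_type)]
best_prob = [softmax(row)[t] for row, t in zip(span_scores, best_type)]

# Rank span indices in descending order under each scoring scheme.
indices = range(len(span_scores))
order_by_logit = sorted(indices, key=lambda i: best_logit[i], reverse=True)
order_by_prob = sorted(indices, key=lambda i: best_prob[i], reverse=True)

print("best logits:", best_logit)       # -3.0 for span 0, 5.0 for span 1
print("best probs: ", best_prob)        # roughly 0.995 for span 0, 0.936 for span 1
print("order by logit:", order_by_logit)
print("order by prob: ", order_by_prob)
```

With these numbers, sorting by logit puts span 1 first, whereas sorting by probability would put the uncertain span 0 first.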

onehaitao commented 4 years ago

Thank you very much.