thunlp / PL-Marker

Source code for "Packed Levitated Marker for Entity and Relation Extraction"
MIT License

P/R/F1 scores reported in results.json and paper #59

Closed. IllDepence closed this issue 1 year ago.

IllDepence commented 1 year ago

Hi,

I ran your model on the provided SciERC data and got results very close to the ones reported in Table 3 of the paper.

I would now like to know how the precision, recall, and F1 scores are calculated given that there are multiple entity classes/relation types. Specifically, I'm wondering:

  1. Is the number reported in the paper and in results.json a macro average or a weighted average?
  2. Is NIL/NOTA (i.e., a token that is not an entity / an entity pair with no relation) part of the calculation?

Thanks :)

YeDeming commented 1 year ago

  1. Weighted average (micro-F1).
  2. NIL/NOTA is part of the denominator of the precision calculation (see the sketch below).
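
For readers landing here later, here is a minimal sketch of what that counting convention means in practice, assuming micro-averaged scores over typed spans: NIL/NOTA is not scored as a class of its own, but every non-NIL prediction enters the precision denominator. The function and variable names are illustrative and not taken from the PL-Marker codebase.

```python
def micro_prf1(gold, pred):
    """Micro-averaged P/R/F1 over typed spans.

    gold, pred: dicts mapping (doc_id, start, end) -> label.
    NIL/NOTA spans are simply absent from both dicts.
    """
    n_correct = sum(1 for span, label in pred.items() if gold.get(span) == label)
    n_pred = len(pred)  # precision denominator: every non-NIL prediction,
                        # including spans whose gold annotation is NIL
    n_gold = len(gold)  # recall denominator: every non-NIL gold span
    p = n_correct / n_pred if n_pred else 0.0
    r = n_correct / n_gold if n_gold else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

# Example: one correct span, one mislabeled span, and one span predicted
# where the gold annotation is NIL.
gold = {("doc1", 0, 2): "Method", ("doc1", 5, 6): "Task"}
pred = {("doc1", 0, 2): "Method",    # correct label        -> true positive
        ("doc1", 5, 6): "Material",  # wrong label          -> false positive
        ("doc1", 9, 9): "Task"}      # gold here is NIL     -> still a false positive
print(micro_prf1(gold, pred))  # (0.333..., 0.5, 0.4)
```

In this scheme a spurious prediction over a NIL span hurts precision but leaves recall untouched, which matches the answer above.
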
IllDepence commented 1 year ago

Thanks for the super-fast reply!

IllDepence commented 1 year ago

Quick follow-up question, just to make sure I understand correctly.

Regarding

> NIL/NOTA is part of the denominator of the precision calculation

if we have two entity classes A and B, then we get a false positive for class A not only when a B-entity is classified as A, but also when a token that is not an entity at all is classified as A, correct?
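
To make the two cases concrete (illustrative only, not code from this repo):

```python
# Both of the following predictions count as false positives for class A
# under the counting described above. None stands for the NIL/NOTA "label".
gold = {"tok1": "B",    # B-entity misclassified as A  -> FP for A
        "tok2": None}   # non-entity predicted as A    -> FP for A
pred = {"tok1": "A", "tok2": "A"}
fp_for_A = sum(1 for t in pred if pred[t] == "A" and gold[t] != "A")  # -> 2
```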

What is not clear to me is whether you treat NIL/NOTA as a class in itself. I.e., if a token that is not an entity is correctly predicted as not an entity, does this count as a true positive?

Thanks again for the clarification.

IllDepence commented 1 year ago

@YeDeming ping (just in case there are no notifications for comments)