MantisAI / nervaluate

Full named-entity (i.e., not tag/token) evaluation metrics based on SemEval’13
MIT License

crucial fixes for evaluation #32

Closed aflueckiger closed 3 years ago

aflueckiger commented 4 years ago

I am part of the team behind the shared task CLEF-HIPE-2020, for which I adapted @davidsbatista's evaluation procedure. Since the scorer is now maintained here, it makes sense to open the pull request in this repository.

During our sanity checks, we came across several crucial bugs in the original code. These bugs are severe as they lead to incorrect evaluation results in some cases. Currently, they are not caught by your tests.

Please review this pull request, which fixes the following:

The diff looks messier than it is: I used an automatic code formatter and forgot to deactivate it beforehand, sorry.

Currently, two unit tests fail. I am not sure whether this is related to my changes, as it concerns only two functions. Please verify on your side so that we have a double check.

Happy to answer questions if there are any.

ivyleavedtoadflax commented 3 years ago

Thanks for this @aflueckiger - and sorry it has taken me so long to get to it. I'll review this today.

ivyleavedtoadflax commented 3 years ago

Many thanks for this @aflueckiger, I've updated the tests which were failing due to this:

an over-generated entity with a valid tag should be attributed to the respective tag as FP only and not to all tags as currently done
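The quoted behavior can be sketched in a few lines. This is a hypothetical illustration, not nervaluate's actual internals: a spurious ("over-generated") predicted entity, i.e. one with no overlapping gold entity, is charged as a false positive only under its own tag, rather than under every tag in the tag set.

```python
# Hypothetical sketch of per-tag FP attribution for spurious entities.
# Entities are (tag, start, end) tuples with inclusive token offsets.
from collections import Counter

def spurious_fp_by_tag(true_entities, pred_entities):
    """Count predicted entities that overlap no gold entity as FPs,
    attributed only to the predicted entity's own tag."""
    fp = Counter()
    for tag, start, end in pred_entities:
        overlaps_gold = any(
            t_start <= end and start <= t_end
            for _t_tag, t_start, t_end in true_entities
        )
        if not overlaps_gold:
            fp[tag] += 1  # FP charged to this tag only, not to all tags
    return fp

true = [("PER", 0, 1), ("LOC", 5, 6)]
pred = [("PER", 0, 1), ("ORG", 10, 11)]  # ORG span is over-generated
print(spurious_fp_by_tag(true, pred))  # Counter({'ORG': 1})
```

Under the buggy behavior described above, the spurious ORG span would have inflated the FP counts of PER and LOC as well, skewing their precision.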

Your scorer looks great. If you want to incorporate that functionality into this package, you are very welcome to, and this is something I would support.