hipe-eval / HIPE-scorer

A Python module for evaluating NERC and NEL system performance as defined in the HIPE shared tasks (formerly CLEF-HIPE-2020-scorer).
https://hipe-eval.github.io
MIT License

Add "pip install" support #26

Open EmanuelaBoros opened 7 months ago

EmanuelaBoros commented 7 months ago

This could be a first step towards the inclusion in evaluate.

simon-clematide commented 7 months ago

Interesting. The current scorer still relies on many preprocessing steps that are idiosyncratically bound to the HIPE format and evaluation scenario. In a way, it could still be seen as an evaluation space on https://huggingface.co/evaluate-metric (similar to GLUE).

EmanuelaBoros commented 7 months ago

@simon-clematide (hoping I did not misunderstand) I don’t see it as such a problem that the metric depends on the annotation style (domain-dependent). I raised the issue mainly because a pip-installable scorer would be easier to integrate on my side when training different models and, of course, easier to integrate into a metric such as seqeval.

EmanuelaBoros commented 7 months ago

One can see the metric as some type of multilevel evaluation (one column per annotation level in HIPE) rather than multitask evaluation (CoNLL with one column per task, e.g. NER, chunking): multilevel-seqeval 🙂
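
To make the "multilevel" idea concrete, here is a minimal sketch in plain Python. It is not the HIPE-scorer or seqeval API: the helper names `extract_spans` and `multilevel_f1` are hypothetical, the column names and tags are only illustrative, and it assumes IOB2 tags with strict exact-match scoring, computed independently for each column.

```python
from typing import Dict, List, Optional, Set, Tuple

# (entity type, start index, end index exclusive)
Span = Tuple[str, int, int]


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one column of IOB2 tags (B-x / I-x / O).

    An I- tag that does not continue a span of the same type is ignored.
    """
    spans: Set[Span] = set()
    start: Optional[int] = None
    label: Optional[str] = None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            spans.add((label, start, i))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def multilevel_f1(
    gold: Dict[str, List[str]], pred: Dict[str, List[str]]
) -> Dict[str, float]:
    """Exact-match span F1, computed separately for every annotation column."""
    scores: Dict[str, float] = {}
    for column, gold_tags in gold.items():
        gold_spans = extract_spans(gold_tags)
        pred_spans = extract_spans(pred[column])
        true_pos = len(gold_spans & pred_spans)
        precision = true_pos / len(pred_spans) if pred_spans else 0.0
        recall = true_pos / len(gold_spans) if gold_spans else 0.0
        denom = precision + recall
        scores[column] = 2 * precision * recall / denom if denom else 0.0
    return scores


# Illustrative data: the same four tokens annotated at two levels.
gold = {
    "NE-COARSE-LIT": ["B-pers", "I-pers", "O", "B-loc"],
    "NE-FINE-LIT": ["B-pers.ind", "I-pers.ind", "O", "B-loc.adm.town"],
}
pred = {
    "NE-COARSE-LIT": ["B-pers", "I-pers", "O", "B-loc"],
    "NE-FINE-LIT": ["B-pers.ind", "I-pers.ind", "O", "B-loc.phys"],
}

scores = multilevel_f1(gold, pred)
print(scores)
```

In this toy example the coarse column matches exactly, while the fine column gets only one of its two spans right, so the two levels receive different scores from the same tokens.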