nilsreiter / CorefAnnotator

Annotation tool for coreference
Apache License 2.0

Possibility to also export unused entities #375

Closed by michaelgoeggelmann 7 months ago

michaelgoeggelmann commented 3 years ago

For some steps in the further processing of annotations, it would be helpful to be able to export the unannotated entities as well.

bkis commented 2 years ago

I was about to try to implement this, but now I'm not sure I understand the requested feature. Maybe I could get some clarification :wink:
Does this refer to entities that are present in the entity tree but don't have any associated mentions? Like "eirmod" in this example?

image

I ask because I am not sure how these should be represented in the export data. In a TEI export, annotations are made "inline", so each mention is annotated by referencing an entity:

image

If there is no mention, where should the annotation be made?

As for CSV exports, I see a similar problem: For the columns A to E in the following example export, there would be no valid values for an entity without any in-text mentions:

image

But perhaps I got this all wrong?

michaelgoeggelmann commented 2 years ago

Thanks for the hints. You understood the problem correctly, but it is probably more of a "luxury problem". The request came to me from cases where the annotations are evaluated via the CSV files and adding the unused entities by hand would take a very long time, e.g. when many entities were predefined in a profile file and then never used. For a single unused entity such as "eirmod", this would certainly not be a problem. One possibility might be for CorefAnnotator to ask during export whether unused entities should be exported as well. If so (though I'm not sure about this), the unused entities could simply be attached to a blank space at the end of the text, or the CSV could use N.A. in all fields except entityLabel. The point, after all, is to make ALL entity names available for evaluation without having to type them out.
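To make the CSV idea concrete, here is a minimal sketch of the "N.A. in every field except entityLabel" variant, done as a post-processing step in Python rather than inside CorefAnnotator. Only `entityLabel` is taken from the thread; the other column names (`begin`, `end`, `surface`) and the helper names `export_rows` and `to_csv` are assumptions made up for illustration, not the tool's actual export layout.

```python
import csv
import io


def export_rows(mentions, all_labels, columns,
                label_column="entityLabel", placeholder="N.A."):
    """Return the mention rows plus one placeholder row per unused entity.

    `mentions` is a list of dicts keyed by `columns`; `all_labels` lists
    every entity label, e.g. as predefined in a profile file.
    """
    rows = [dict(m) for m in mentions]
    used = {m[label_column] for m in mentions}
    for label in all_labels:
        if label not in used:
            # No mention exists, so every field except the label is a placeholder.
            row = {column: placeholder for column in columns}
            row[label_column] = label
            rows.append(row)
    return rows


def to_csv(rows, columns):
    """Serialize the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical column layout; only "entityLabel" comes from the discussion.
COLUMNS = ["begin", "end", "surface", "entityLabel"]
MENTIONS = [{"begin": 0, "end": 5, "surface": "Lorem", "entityLabel": "lorem"}]

print(to_csv(export_rows(MENTIONS, ["lorem", "eirmod"], COLUMNS), COLUMNS))
```

With this layout, an evaluation script can find the unused entities by filtering for rows whose position fields hold the placeholder.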

nilsreiter commented 7 months ago

This will not be added any time soon. If needed, un-annotated entities can be extracted from the XMI file.
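For anyone who needs this in the meantime, the extraction from the XMI file could look roughly like the sketch below. It rests on assumptions that should be checked against a real file: that entities are serialized as elements whose local name is `Entity` with a `Label` attribute, and that mentions are `Mention` elements whose `Entity` attribute holds the `xmi:id` of the entity they refer to. The function `unused_entities` and the sample document are hypothetical; the element and attribute names are parameters so they can be adjusted to the actual type system.

```python
import xml.etree.ElementTree as ET

XMI_NS = "http://www.omg.org/XMI"  # namespace of the xmi:id attribute


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def unused_entities(xmi_text, entity_tag="Entity", mention_tag="Mention",
                    ref_attr="Entity", label_attr="Label"):
    """Return the sorted labels of entities that no mention refers to.

    The tag and attribute names are assumptions about the XMI layout.
    """
    root = ET.fromstring(xmi_text)
    entities = {}       # xmi:id -> label
    referenced = set()  # xmi:ids that at least one mention points to
    for elem in root.iter():
        name = local_name(elem.tag)
        if name == entity_tag:
            entities[elem.get(f"{{{XMI_NS}}}id")] = elem.get(label_attr, "")
        elif name == mention_tag:
            # References are written as whitespace-separated xmi:id values.
            referenced.update(elem.get(ref_attr, "").split())
    return sorted(label for eid, label in entities.items()
                  if eid not in referenced)


# Made-up sample document, not taken from a real CorefAnnotator file.
SAMPLE = """<xmi:XMI xmlns:xmi="http://www.omg.org/XMI"
    xmlns:v1="http:///example/v1.ecore" xmi:version="2.0">
  <v1:Entity xmi:id="10" Label="lorem"/>
  <v1:Entity xmi:id="11" Label="eirmod"/>
  <v1:Mention xmi:id="20" begin="0" end="5" Entity="10"/>
</xmi:XMI>"""

print(unused_entities(SAMPLE))  # ['eirmod']
```

For a file on disk, the text can be read with `open(path, encoding="utf-8").read()` and passed in unchanged.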