wikilinks / neleval

Entity disambiguation evaluation and error analysis tool

Within-document evaluation mode for cross-doc coreference evaluation #19

Closed jnothman closed 8 years ago

jnothman commented 8 years ago

Should be able to evaluate (the micro-average over documents of) within-document coreference resolution performance. With the current implementation, the following approaches exist:

Note that the former approach breaks for the pairwise_negative aggregate, as true negatives from across the corpus will be counted.
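To illustrate the inflation (a hypothetical numeric example, not neleval's actual implementation): if mentions from all documents are pooled into one corpus, every pair of mentions drawn from two different documents is non-coreferent by construction and so counts as an extra true negative.

```python
# Hypothetical corpus: number of mentions per document.
mentions_per_doc = [10, 12, 8]

# Candidate mention pairs confined within each document
# (the intended scope for within-document true negatives).
within_doc_pairs = sum(n * (n - 1) // 2 for n in mentions_per_doc)

# Candidate mention pairs over the pooled corpus: all cross-document
# pairs are non-coreferent, so they inflate the true negative count.
total = sum(mentions_per_doc)
pooled_pairs = total * (total - 1) // 2

print(within_doc_pairs)  # 139 pairs within documents
print(pooled_pairs)      # 435 pairs over the pooled corpus
```

Here the pooled count includes 296 cross-document pairs that should never contribute to a within-document score.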

My current preferred solution is to add an option to the evaluate command specifying which fields to break the calculation down by: ordinarily 'doc', though 'type' might also be of interest. evaluate would then calculate all measures over each group, and additionally report micro-averaged and macro-averaged results. This would also mean we could rename the sets-micro aggregate to sets.
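A minimal sketch of how that grouping might work, assuming a hypothetical score_group callable that returns tp/fp/fn counts for one group and annotation objects carrying the grouping field; none of these names are neleval's actual API:

```python
from collections import defaultdict


def prf(tp, fp, fn):
    """Precision, recall and F1 from raw counts."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f


def evaluate_by(annotations, score_group, field="doc"):
    """Break evaluation down by `field` (e.g. 'doc' or 'type').

    `score_group` is a hypothetical stand-in returning
    {'tp': ..., 'fp': ..., 'fn': ...} for one group of annotations.
    """
    groups = defaultdict(list)
    for ann in annotations:
        groups[getattr(ann, field)].append(ann)

    per_group = {key: score_group(anns) for key, anns in groups.items()}

    # Micro-average: pool counts across groups, then compute P/R/F once.
    pooled = defaultdict(int)
    for counts in per_group.values():
        for k, v in counts.items():
            pooled[k] += v
    micro = prf(pooled["tp"], pooled["fp"], pooled["fn"])

    # Macro-average: compute P/R/F per group, then average the scores.
    scores = [prf(c["tp"], c["fp"], c["fn"]) for c in per_group.values()]
    macro = tuple(sum(vals) / len(scores) for vals in zip(*scores))

    return per_group, micro, macro
```

Grouping by 'doc' keeps every pairwise count confined to a single document, so the pairwise_negative problem above disappears, and the same machinery gives per-type breakdowns for free.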

Thanks for expressing the need for this, @shyamupa