wikilinks / conll03_nel_eval

Python evaluation scripts for AIDA-formatted CoNLL data
Apache License 2.0

Verify against published results #9

Closed: benhachey closed this issue 10 years ago

wejradford commented 10 years ago

I've added to the README and a reported-results .csv file. What do you think?

benhachey commented 10 years ago

I skimmed back over the Hoffart et al. (2011) paper: "We consider only mention-entity pairs where the ground-truth gives a known entity, and thus ignore roughly 20% of the mentions without known entity in the ground-truth."

We should address this when we add rank measures (see #10).
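
For reference, the effect of that convention is roughly the sketch below: drop gold mentions whose ground truth is NIL before scoring. Field names and the NIL markers other than `--NME--` (the AIDA convention) are illustrative, not the repo's actual schema.

```python
def filter_known_gold(gold_mentions):
    """Keep only mentions whose ground truth links to a known KB entity."""
    return [m for m in gold_mentions
            if m.get("kb_id") not in (None, "NIL", "--NME--")]


def accuracy_on_known(gold_mentions, system_links):
    """Accuracy over gold mentions with a known entity, per Hoffart et al. (2011)."""
    gold = filter_known_gold(gold_mentions)
    correct = sum(
        1 for m in gold
        if system_links.get((m["doc_id"], m["start"], m["end"])) == m["kb_id"]
    )
    return correct / len(gold) if gold else 0.0
```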

benhachey commented 10 years ago

Hoffart et al. (2012) state that they follow the Hoffart et al. (2011) evaluation methodology.

benhachey commented 10 years ago

So it's very good to have the Cornolti et al. numbers there, and the comparison should be correct, especially once mapping is up and running (#11).

It's nice to have the other numbers as well, with the caveat that there is no direct comparison until rank measures are added in a future release (#10).
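
As a rough illustration only, a rank measure could look like the mean reciprocal rank sketch below, assuming each system entry is a candidate list ordered best-first. #10 doesn't commit to a specific measure, so this is one common choice rather than what the release will implement.

```python
def mean_reciprocal_rank(gold_links, system_candidates):
    """gold_links: mention key -> gold KB id.
    system_candidates: mention key -> ranked list of candidate KB ids."""
    total = 0.0
    for key, gold_id in gold_links.items():
        candidates = system_candidates.get(key, [])
        if gold_id in candidates:
            total += 1.0 / (candidates.index(gold_id) + 1)
    return total / len(gold_links) if gold_links else 0.0
```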

wejradford commented 10 years ago

I think the best way to handle this is to specify a subset of mentions for a filter operation. We may be able to use the means file.
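
Something like the following, say: keep only annotations whose mention key appears in a whitelist, which could be derived from the means file. The whitelist format and function names here are hypothetical, not a committed design.

```python
def load_mention_whitelist(path):
    """Read tab-separated (doc_id, start, end) triples into a set of keys."""
    with open(path) as f:
        return {tuple(line.rstrip("\n").split("\t")[:3])
                for line in f if line.strip()}


def filter_to_subset(annotations, whitelist):
    """Keep annotations whose (doc_id, start, end) key is in the whitelist."""
    # Offsets read from the file are strings, so cast before comparing.
    return [
        a for a in annotations
        if (a["doc_id"], str(a["start"]), str(a["end"])) in whitelist
    ]
```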

benhachey commented 10 years ago

I'm trying to get TagMe2 output. Let's use that if we can.