wikilinks / neleval

Entity disambiguation evaluation and error analysis tool
Apache License 2.0

select-alternatives preprocessor #25

Closed by jnothman 7 years ago

jnothman commented 7 years ago

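This is to allow the gold standard to have alternative facts! That is, a gold mention can list several alternative candidates (entity ID, score, type), and the preprocessor picks one of them per mention with reference to the system output.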

So far only informally tested with the following data:

$ for f in test_fixtures/select-alternatives/*; do echo == $f ==; cat $f; done
== test_fixtures/select-alternatives/gold.txt ==
DOC1    0   1   E1  1.0 PER E2  1.0 ORG
DOC1    1   2   E3  1.0 PER E4  1.0 ORG E5  1.0 ORG
DOC1    3   4   E6  1.0 PER E7  1.0 ORG
DOC1    4   5   E8  1.0 PER E9  1.0 ORG
DOC1    5   6   E10 1.0 PER
== test_fixtures/select-alternatives/gold_exp_eid.txt ==
DOC1    0   1   E2  1.0 ORG
DOC1    1   2   E4  1.0 ORG
DOC1    3   4   E7  1.0 ORG
DOC1    4   5   E8  1.0 PER
DOC1    5   6   E10 1.0 PER
== test_fixtures/select-alternatives/gold_exp_eidtype.txt ==
DOC1    0   1   E1  1.0 PER
DOC1    1   2   E4  1.0 ORG
DOC1    3   4   E7  1.0 ORG
DOC1    4   5   E8  1.0 PER
DOC1    5   6   E10 1.0 PER
== test_fixtures/select-alternatives/sys.txt ==
DOC1    0   1   E2  1.0 PER
DOC1    100 101 E4  1.0 ORG
DOC2    0   1   E7  1.0 ORG

Then these checks pass (each diff prints nothing):

diff <(./nel select-alternatives -g test_fixtures/select-alternatives/{gold,sys}.txt) test_fixtures/select-alternatives/gold_exp_eid.txt
diff <(./nel select-alternatives -f eid,type -g test_fixtures/select-alternatives/{gold,sys}.txt) test_fixtures/select-alternatives/gold_exp_eidtype.txt
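For readers who want to see the selection rule implied by the fixtures above, here is a minimal Python sketch. It is inferred only from the sample data and is not neleval's actual implementation: the function names (`parse_line`, `key_of`, `select_alternatives`) are hypothetical, and the rule it assumes is "keep the first gold candidate whose key fields occur anywhere in the system output, otherwise fall back to the first candidate".

```python
# Sketch of the selection behaviour suggested by the fixtures (assumption,
# not the real neleval code). Each line is: doc, start, end, then repeated
# (entity id, score, type) triples, separated by whitespace.

FIELD_INDEX = {"eid": 0, "type": 2}  # position of each field within a triple


def parse_line(line):
    """Return ((doc, start, end), [candidate triples]) for one annotation line."""
    parts = line.split()
    span, rest = tuple(parts[:3]), parts[3:]
    candidates = [tuple(rest[i:i + 3]) for i in range(0, len(rest), 3)]
    return span, candidates


def key_of(candidate, fields):
    """Project a candidate triple onto the fields used for matching."""
    return tuple(candidate[FIELD_INDEX[f]] for f in fields)


def select_alternatives(gold_lines, sys_lines, fields=("eid",)):
    """Reduce every gold mention to a single candidate."""
    # Collect the keys seen anywhere in the system output (any span, any doc).
    sys_keys = set()
    for line in sys_lines:
        _, candidates = parse_line(line)
        sys_keys.update(key_of(c, fields) for c in candidates)

    selected = []
    for line in gold_lines:
        span, candidates = parse_line(line)
        # First candidate matching the system output, else the first listed.
        chosen = next(
            (c for c in candidates if key_of(c, fields) in sys_keys),
            candidates[0],
        )
        selected.append("\t".join(span + chosen))
    return selected


GOLD = [
    "DOC1 0 1 E1 1.0 PER E2 1.0 ORG",
    "DOC1 1 2 E3 1.0 PER E4 1.0 ORG E5 1.0 ORG",
    "DOC1 3 4 E6 1.0 PER E7 1.0 ORG",
    "DOC1 4 5 E8 1.0 PER E9 1.0 ORG",
    "DOC1 5 6 E10 1.0 PER",
]
SYS = [
    "DOC1 0 1 E2 1.0 PER",
    "DOC1 100 101 E4 1.0 ORG",
    "DOC2 0 1 E7 1.0 ORG",
]

if __name__ == "__main__":
    print("\n".join(select_alternatives(GOLD, SYS)))
    print("\n".join(select_alternatives(GOLD, SYS, fields=("eid", "type"))))
```

Under this assumed rule, the first call reproduces gold_exp_eid.txt and the second reproduces gold_exp_eidtype.txt: the only difference is the mention at 0-1, where the system's E2/PER matches gold E2/ORG on entity ID alone but not on ID plus type.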