dice-group / gerbil

GERBIL - General Entity annotatoR Benchmark
GNU Affero General Public License v3.0

Add TAC-KBP Experiments #8

Closed MichaelRoeder closed 10 years ago

MichaelRoeder commented 10 years ago

Add experiment types from http://nlp.cs.rpi.edu/kbp/2014/KBP2014EL_V0.2.pdf

RicardoUsbeck commented 10 years ago

Make sure that the licence is not violated when the data is transformed to the annotation backend. Keep in mind that we only want open data and open source software.

giusepperizzo commented 10 years ago

Similarly to #16: what does "open data" mean here?

giusepperizzo commented 10 years ago

TAC KBP 2014 scorer: https://github.com/wikilinks/neleval

I propose using the scorer as it is in the official repository and formatting both the gold standard (GS) and the system outputs to fit the inputs the scorer expects.

A few aspects to consider: an entity is defined as an ordered list of the following features: doc_id, startOffset, endOffset, uri, salience, type

For the majority of the systems supported in GERBIL (and in NERD), doc_id, start and end offsets, and uri are available. The type is a different matter. For instance, for Babelfy we would have to retrieve it from a Wikipedia page or similar (am I mistaken?). The same holds for the salience score.
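To make the field order above concrete, here is a minimal sketch of how one annotation could be serialised with the optional fields left empty when a system does not provide them. The `Entity` class, the `to_scorer_line` helper, the tab delimiter and the empty-string placeholder are all assumptions for illustration; the scorer's own documentation should be checked for the exact input format.

```python
from typing import NamedTuple, Optional


class Entity(NamedTuple):
    """One annotation, in the field order listed above (hypothetical helper type)."""
    doc_id: str
    start_offset: int
    end_offset: int
    uri: str
    salience: Optional[float] = None  # not provided by every annotator
    type: Optional[str] = None        # may need to be looked up separately


def to_scorer_line(entity: Entity, missing: str = "") -> str:
    """Serialise an entity as one delimited line.

    Assumption: fields are tab-separated and absent values are written
    as `missing`; neither is confirmed by the discussion above.
    """
    fields = [
        entity.doc_id,
        str(entity.start_offset),
        str(entity.end_offset),
        entity.uri,
        missing if entity.salience is None else str(entity.salience),
        missing if entity.type is None else entity.type,
    ]
    return "\t".join(fields)


# A system that returns only offsets and a URI.
partial = Entity("doc1", 0, 6, "http://dbpedia.org/resource/Berlin")
# A system that also returns salience and type.
full = Entity("doc1", 0, 6, "http://dbpedia.org/resource/Berlin", 0.9, "GPE")

print(to_scorer_line(partial))
print(to_scorer_line(full))
```

Keeping salience and type optional means the same conversion can be applied to the gold standard and to every system output, whether or not the system reports those two fields.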

RicardoUsbeck commented 10 years ago

Please refine what you mean by experiment types, matchings, and evaluation measures, and open a new, separate issue for each of them.

RicardoUsbeck commented 10 years ago

See #48 and #49. @giusepperizzo Are there more experiment types we have to cover?

rtroncy commented 10 years ago

RicardoUsbeck commented 10 years ago

Just to clarify:

rtroncy commented 10 years ago

Right, but the world is complex.

RicardoUsbeck commented 10 years ago

It would be better to move the definition to the paper and the discussion to the mailing list.