umcu / negation-detection

Negation detection in Dutch clinical text.
GNU General Public License v3.0

Score the performance #5

Closed lcreteig closed 2 years ago

lcreteig commented 3 years ago

Evaluate the performance of the rules as in the ContextD paper: precision, recall, and F1-score, separately for each document type.

For each labeled entity in a doc (`for ent in doc.ents`):
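A minimal sketch of what that evaluation could look like, assuming each entity has been paired with a gold negation label and the rule-based prediction, plus the document type it came from (the record layout and type names here are assumptions, not the repo's actual data model):

```python
from collections import defaultdict

# Hypothetical records: (doc_type, gold_negated, predicted_negated) per entity.
records = [
    ("GP", True, True),
    ("GP", True, False),
    ("GP", False, False),
    ("Radiology", False, True),
    ("Radiology", True, True),
]

def scores_per_doc_type(records):
    """Precision, recall, and F1 for the 'negated' class, per document type."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0})
    for doc_type, gold, pred in records:
        c = counts[doc_type]
        if gold and pred:
            c["tp"] += 1
        elif not gold and pred:
            c["fp"] += 1
        elif gold and not pred:
            c["fn"] += 1
    out = {}
    for doc_type, c in counts.items():
        precision = c["tp"] / (c["tp"] + c["fp"]) if c["tp"] + c["fp"] else 0.0
        recall = c["tp"] / (c["tp"] + c["fn"]) if c["tp"] + c["fn"] else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        out[doc_type] = {"precision": precision, "recall": recall, "f1": f1}
    return out
```

The same counts could of course come from `sklearn.metrics`, but a plain tally makes the per-document-type split explicit.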

Note that the other approaches use part of the ECC dataset for training, so coordinate with them on which split to actually evaluate on.

lcreteig commented 3 years ago

Make this congruent with the general evaluation code, so that we can join the predictions from all models for each entity:
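One way such a join could work, assuming each entity can be keyed by `(doc_id, start_char, end_char)` (the model names and dict layout below are illustrative, not the actual evaluation code):

```python
def join_predictions(per_model):
    """Merge {model_name: {entity_key: prediction}} into one row per entity.

    Each entity_key is a hypothetical (doc_id, start_char, end_char) tuple;
    the result maps entity_key -> {model_name: prediction}.
    """
    joined = {}
    for model, preds in per_model.items():
        for key, pred in preds.items():
            joined.setdefault(key, {})[model] = pred
    return joined

# Example with two made-up models disagreeing on one entity:
per_model = {
    "rules": {("doc1", 0, 5): True, ("doc1", 10, 15): False},
    "bert":  {("doc1", 0, 5): True, ("doc1", 10, 15): True},
}
joined = join_predictions(per_model)
```

Keying on character offsets (rather than token indices) keeps the join stable even if the models tokenise differently.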

Right now, most of the rule-based "pipeline" happens in the preprocessing step. We could also consider matching the pipeline design of the other models, where the raw text is loaded in to make a prediction. In that case, everything would happen in one pipeline, including tokenisation, labelling, and sentence splitting.
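A toy sketch of that single-pipeline shape, where raw text goes in and sentence splitting, tokenisation, entity labelling, and negation prediction all happen in one call. Every helper here is a naive placeholder (whitespace tokenisation, period-based sentence splitting, a fixed cue list), not the repo's actual rules:

```python
def predict(raw_text, entity_terms, negation_cues=("geen", "niet")):
    """Return {entity_term: negated?} for each sentence mentioning an entity.

    raw_text: the unprocessed clinical note.
    entity_terms: hypothetical pre-known entity strings to label.
    negation_cues: a tiny stand-in for the real Dutch trigger lists.
    """
    results = {}
    for sentence in raw_text.split("."):       # naive sentence splitting
        tokens = sentence.lower().split()      # naive tokenisation
        negated = any(cue in tokens for cue in negation_cues)
        for term in entity_terms:              # naive entity labelling
            if term in tokens:
                results[term] = negated
    return results
```

With this design, the same `predict(raw_text, ...)` entry point could be shared with the other models, so the evaluation code never has to know about a separate preprocessing step.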