Discuss with Xingyi (@deansong); he has scripts for doing this.
The rules in the old Teamware are the same as the rules in GATE Developer.
Common remedial action
Things to consider: different types of IAA metric are needed for different types of task.
Maybe this needs to be part of the project configuration for each widget, so we know what sort of metrics to use for that particular question.
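A minimal sketch of what such per-widget configuration might look like; the widget names, metric identifiers, and field names below are purely illustrative, not part of any existing project schema:

```python
# Hypothetical per-widget IAA configuration; every key and value here is
# an assumption used only to illustrate the idea.
PROJECT_IAA_CONFIG = {
    "sentiment":  {"widget": "radio",    "iaa_metric": "cohens_kappa"},
    "confidence": {"widget": "slider",   "iaa_metric": "krippendorff_alpha_interval"},
    "topics":     {"widget": "checkbox", "iaa_metric": "multilabel_jaccard"},
}
```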
Variables may not be independent, e.g. a classification plus an associated confidence score. Do we want to use the confidence score as a weighting of some kind, or as a threshold (e.g. only compute IAA on pairs where both annotators' confidence is > 4)?
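As a sketch of the thresholding option, annotation pairs could be filtered before agreement is computed; the tuple layout below is an assumption for illustration, not Teamware's actual data model:

```python
# Keep only pairs where both annotators reported confidence above the
# threshold; the (label, confidence) tuple layout is illustrative only.
def filter_by_confidence(pairs, threshold=4):
    """pairs: iterable of (label_a, conf_a, label_b, conf_b) tuples."""
    return [
        (label_a, label_b)
        for label_a, conf_a, label_b, conf_b in pairs
        if conf_a > threshold and conf_b > threshold
    ]
```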
We need to check what the standard procedure is for comparing multiple-choice agreement.
This depends on the scale defined for a particular project (#349).
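If the project defines an ordinal scale, a weighted kappa is one option; a sketch using scikit-learn, assuming the scale labels have already been mapped to ordered integers:

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical ratings from two annotators on a 1-5 ordinal scale.
annotator_a = [1, 3, 4, 5, 2, 4]
annotator_b = [2, 3, 4, 4, 2, 5]

# Linear weighting penalises disagreements by their distance on the scale.
kappa = cohen_kappa_score(annotator_a, annotator_b, weights="linear")
print(f"Weighted kappa: {kappa:.3f}")
```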
Support calculating IAA for:
Extract, for each pair of annotators, all the annotations they have done
Calculate simple matches for document classification
For sequence labelling, we need to do this using begin/end offsets and per annotation type (see the sketch after this list)
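A rough sketch of these three steps under an assumed data layout (annotator → document → label dicts, and (start, end, type) span tuples); none of the names below come from the existing codebase:

```python
from itertools import combinations

def pairwise_annotations(annotations):
    """annotations: {annotator_id: {doc_id: label}} (an assumed layout).
    For each pair of annotators, collect labels on documents both annotated."""
    for a, b in combinations(sorted(annotations), 2):
        shared = annotations[a].keys() & annotations[b].keys()
        yield (a, b), [(annotations[a][d], annotations[b][d]) for d in sorted(shared)]

def simple_agreement(label_pairs):
    """Observed (percentage) agreement for document classification."""
    if not label_pairs:
        return None
    return sum(x == y for x, y in label_pairs) / len(label_pairs)

def span_agreement_f1(spans_a, spans_b):
    """Sequence labelling: spans are (start, end, type) tuples and only count
    as matching when begin/end offsets and annotation type are all identical."""
    a, b = set(spans_a), set(spans_b)
    if not a and not b:
        return 1.0
    matched = len(a & b)
    precision = matched / len(a) if a else 0.0
    recall = matched / len(b) if b else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0
```

The simple percentage agreement could later be swapped for a chance-corrected metric such as Cohen's kappa without changing the pairwise extraction step.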