Discuss with Xingyi (@deansong); he has scripts for doing this.
The rules in the old Teamware are the same as the rules in GATE Developer.
Common remedial action
Things to consider: different types of IAA metric are needed for different types of task.
Maybe this needs to be part of the project configuration for each widget, so we know what sort of metrics to use for that particular question.
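A minimal sketch of what such per-widget configuration might look like; the widget names, metric identifiers, and field names below are purely illustrative, not part of any existing project schema:

```python
# Hypothetical per-widget IAA configuration; every key and value here is
# an assumption used only to illustrate the idea.
PROJECT_IAA_CONFIG = {
    "sentiment":  {"widget": "radio",    "iaa_metric": "cohens_kappa"},
    "confidence": {"widget": "slider",   "iaa_metric": "krippendorff_alpha_interval"},
    "topics":     {"widget": "checkbox", "iaa_metric": "multilabel_jaccard"},
}
```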
Variables may not be independent, e.g. a classification plus an associated confidence score. Do we want to use the confidence score as a weighting of some kind, or as a threshold (e.g. only compute IAA on pairs where both annotators' confidence is > 4)?
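As a sketch of the thresholding option, annotation pairs could be filtered before agreement is computed; the tuple layout below is an assumption for illustration, not Teamware's actual data model:

```python
# Keep only pairs where both annotators reported confidence above the
# threshold; the (label, confidence) tuple layout is illustrative only.
def filter_by_confidence(pairs, threshold=4):
    """pairs: iterable of (label_a, conf_a, label_b, conf_b) tuples."""
    return [
        (label_a, label_b)
        for label_a, conf_a, label_b, conf_b in pairs
        if conf_a > threshold and conf_b > threshold
    ]
```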
We need to check what the standard procedure is for comparing multiple-choice agreement.
This depends on the scale defined for a particular project (#349).
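If the project defines an ordinal scale, a weighted kappa is one option; a sketch using scikit-learn, assuming the scale labels have already been mapped to ordered integers:

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical ratings from two annotators on a 1-5 ordinal scale.
annotator_a = [1, 3, 4, 5, 2, 4]
annotator_b = [2, 3, 4, 4, 2, 5]

# Linear weighting penalises disagreements by their distance on the scale.
kappa = cohen_kappa_score(annotator_a, annotator_b, weights="linear")
print(f"Weighted kappa: {kappa:.3f}")
```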
Support calculating IAA for:
Extract, for each pair of annotators, all the annotations they have done
Calculate simple matches for document classification
For sequence labelling, we need to do this using begin/end offsets and per annotation type (see the sketch after this list)
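A rough sketch of these three steps under an assumed data layout (annotator → document → label dicts, and (start, end, type) span tuples); none of the names below come from the existing codebase:

```python
from itertools import combinations

def pairwise_annotations(annotations):
    """annotations: {annotator_id: {doc_id: label}} (an assumed layout).
    For each pair of annotators, collect labels on documents both annotated."""
    for a, b in combinations(sorted(annotations), 2):
        shared = annotations[a].keys() & annotations[b].keys()
        yield (a, b), [(annotations[a][d], annotations[b][d]) for d in sorted(shared)]

def simple_agreement(label_pairs):
    """Observed (percentage) agreement for document classification."""
    if not label_pairs:
        return None
    return sum(x == y for x, y in label_pairs) / len(label_pairs)

def span_agreement_f1(spans_a, spans_b):
    """Sequence labelling: spans are (start, end, type) tuples and only count
    as matching when begin/end offsets and annotation type are all identical."""
    a, b = set(spans_a), set(spans_b)
    if not a and not b:
        return 1.0
    matched = len(a & b)
    precision = matched / len(a) if a else 0.0
    recall = matched / len(b) if b else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0
```

The simple percentage agreement could later be swapped for a chance-corrected metric such as Cohen's kappa without changing the pairwise extraction step.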