Closed michelole closed 4 years ago
The current Python code for statistical testing uses the `randtest` library for approximate randomization, which assumes groups are independent.
However, our experimental units (topics) are paired (i.e., we run all experiments on all of them) and therefore there is a dependence among runs.
This commit reimplements statistical testing using the popular R `coin` library on top of the existing code for parsing `trec_eval` and `sampleval` files.
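
To illustrate why pairing matters, here is a minimal Python sketch of a paired approximate randomization test. It is not the R code in this commit and does not use `coin` or `randtest`; the function name `paired_randomization_test`, its parameters, and the example scores are all hypothetical. Instead of shuffling scores across two independent groups, it randomly flips the sign of each per-topic difference, which keeps every topic's two scores together.

```python
import random
from statistics import mean


def paired_randomization_test(scores_a, scores_b, n_resamples=10000, seed=0):
    """Two-sided paired approximate randomization (sign-flip) test.

    scores_a and scores_b hold one score per topic, in the same topic order.
    Returns an estimated p-value for the mean per-topic difference.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both runs must be scored on the same topics")

    rng = random.Random(seed)
    # Per-topic differences: this is where the pairing is taken into account.
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(mean(diffs))

    extreme = 0
    for _ in range(n_resamples):
        # Under the null hypothesis the two labels are exchangeable within
        # a topic, so each difference may have its sign flipped.
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        # Small tolerance guards against floating-point noise in ties.
        if abs(mean(flipped)) >= observed - 1e-12:
            extreme += 1

    # Add-one correction so the estimated p-value is never exactly zero.
    return (extreme + 1) / (n_resamples + 1)


if __name__ == "__main__":
    # Hypothetical per-topic scores for two runs over the same ten topics.
    run_a = [0.52, 0.61, 0.33, 0.47, 0.70, 0.58, 0.41, 0.66, 0.39, 0.55]
    run_b = [0.48, 0.55, 0.30, 0.45, 0.62, 0.51, 0.40, 0.60, 0.35, 0.50]
    print(paired_randomization_test(run_a, run_b))
```

A test that treats the two runs as independent groups would mix scores from different topics and ignore the per-topic dependence described above; flipping signs within each topic avoids that.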