Run stat tests on the partial repairs statistics

Partial repair identification:

N-sample categorical -> quantitative
Probably multiple pairwise Mann-Whitney U tests (with a multiple-comparison correction), or a single Kruskal-Wallis H test across all N samples.
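
A minimal sketch of how the independent-samples comparison might look with `scipy.stats`. The group names, the toy values, and the helper `compare_groups` are hypothetical placeholders, not data or code from this repository. The Bonferroni correction is one possible choice for the pairwise tests, not a decision made in this issue.

```python
from itertools import combinations

from scipy.stats import kruskal, mannwhitneyu


def compare_groups(groups, alpha=0.05):
    """Compare a quantitative measure across N independent categories.

    groups: dict mapping a category name to a list of numeric values.
    Returns the Kruskal-Wallis statistic and p-value, plus a dict of
    pairwise Mann-Whitney U results.
    """
    # Omnibus test: do any of the categories differ?
    h_stat, h_p = kruskal(*groups.values())

    # Pairwise tests, with a Bonferroni-corrected threshold (an assumption).
    pairs = list(combinations(groups, 2))
    corrected_alpha = alpha / len(pairs)
    pairwise = {}
    for a, b in pairs:
        u_stat, p = mannwhitneyu(groups[a], groups[b], alternative="two-sided")
        pairwise[(a, b)] = (u_stat, p, p < corrected_alpha)
    return h_stat, h_p, pairwise


# Toy values only; the real inputs would come from the partial repair statistics.
toy_groups = {
    "cat_a": [1, 2, 3, 4, 5],
    "cat_b": [6, 7, 8, 9, 10],
    "cat_c": [2, 3, 4, 5, 6],
}
h_stat, h_p, pairwise = compare_groups(toy_groups)
```

Running the omnibus test first and the pairwise tests only as a follow-up keeps the number of comparisons small.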

Granularity

Paired (Triplets) categorical -> quantitative
Probably multiple pairwise Wilcoxon signed-rank tests (with a multiple-comparison correction), or a single Friedman test across the three paired measurements.
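
A minimal sketch of the paired version with `scipy.stats`, assuming each partial repair contributes one numeric value per granularity level. The level names `level_1` to `level_3`, the toy values, and the helper `compare_paired` are hypothetical. The Bonferroni correction is again only one possible choice.

```python
from itertools import combinations

from scipy.stats import friedmanchisquare, wilcoxon


def compare_paired(columns, alpha=0.05):
    """Compare k paired measurements taken on the same n items.

    columns: dict mapping a level name to a list of n values, where index i
    in every list refers to the same item (here, the same partial repair).
    """
    # Omnibus test across all k levels (scipy requires k >= 3).
    stat, p = friedmanchisquare(*columns.values())

    # Pairwise signed-rank tests with a Bonferroni-corrected threshold.
    pairs = list(combinations(columns, 2))
    corrected_alpha = alpha / len(pairs)
    pairwise = {}
    for a, b in pairs:
        w_stat, w_p = wilcoxon(columns[a], columns[b])
        pairwise[(a, b)] = (w_stat, w_p, w_p < corrected_alpha)
    return stat, p, pairwise


# Toy values only, chosen so that no paired difference is zero.
toy_columns = {
    "level_1": [1, 2, 3, 4, 5, 6, 7, 8],
    "level_2": [2, 4, 6, 8, 10, 12, 14, 16],
    "level_3": [1.5, 0.5, 5.5, 0.5, 9.5, 0.5, 13.5, 0.5],
}
friedman_stat, friedman_p, paired_results = compare_paired(toy_columns)
```

With real data, many paired differences may be exactly zero. `wilcoxon` discards zero differences by default, so the effective sample size for each pairwise test can be much smaller than n.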

Notes from Zhen on the Friedman test and why I think it works:

Classic analogies for when to use the Friedman test are:

n wine judges each rate k different wines. Are any of the k wines ranked consistently higher or lower than the others?

The 3 granularity levels are the k wines. The 1884+596 partial repairs are the n judges. The Pos/Neu/Neg (positive/neutral/negative) changes in tests are the ratings.

n welders each use k welding torches, and the ensuing welds were rated on quality. Do any of the k torches produce consistently better or worse welds?

The 3 granularity levels are the k welding torches. The 1884+596 partial repairs are the n welders. The Pos/Neu/Neg (positive/neutral/negative) changes in tests are the better/worse welds.
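
The mapping in these analogies can be sketched as an n x k matrix: one row per partial repair (judge or welder), one column per granularity level (wine or torch). The integer encoding of Pos/Neu/Neg, the toy rows, and the helper names below are assumptions for illustration only, not the real data.

```python
import numpy as np
from scipy.stats import friedmanchisquare, rankdata

# Hypothetical ordinal encoding of the test outcome.
ENCODING = {"Neg": -1, "Neu": 0, "Pos": 1}


def to_matrix(rows):
    """Turn rows of Pos/Neu/Neg labels into an n x k integer matrix."""
    return np.array([[ENCODING[outcome] for outcome in row] for row in rows])


def mean_ranks(matrix):
    """Rank the k levels within each row, then average the ranks per level.

    Tied outcomes within a row share the average of the ranks they span.
    """
    ranks = np.apply_along_axis(rankdata, 1, matrix)
    return ranks.mean(axis=0)


# Toy rows: each tuple is one partial repair at the 3 granularity levels.
toy_rows = [
    ("Pos", "Neu", "Neg"),
    ("Pos", "Pos", "Neu"),
    ("Neu", "Neu", "Neg"),
    ("Pos", "Neu", "Neu"),
    ("Pos", "Neg", "Neg"),
    ("Neu", "Neu", "Neu"),
]
matrix = to_matrix(toy_rows)
level_mean_ranks = mean_ranks(matrix)

# The Friedman test takes one sequence per level, i.e. the matrix columns.
stat, p_value = friedmanchisquare(*matrix.T)
```

Because there are only three possible outcomes, ties within a row will be common. The Friedman test works on within-row ranks, so rows where all three levels have the same outcome carry no information about differences between levels.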
