Probably multiple Mann-Whitney U tests, or a Kruskal-Wallis H test.
Granularity
Paired (Triplets) categorical -> quantitative
Probably multiple Wilcoxon signed-rank tests, or a Friedman test.
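A minimal sketch of the "multiple Wilcoxon signed-rank tests" option, assuming the Pos/Neu/Neg change is coded as +1/0/-1 and each partial repair has one value per granularity level. The level names (`level_a`, `level_b`, `level_c`), the toy data, and the helper `wilcoxon_signed_rank` are hypothetical, not taken from the study. The p-value uses the normal approximation with a tie correction and no continuity correction, which is crude for samples this small, and the three pairwise p-values are Bonferroni-adjusted.

```python
import math
from itertools import combinations
from statistics import NormalDist


def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test (normal approximation).

    Returns (W_plus, p_value). Zero differences are dropped.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Give each distinct |difference| the average of the ranks it occupies.
    magnitudes = sorted(abs(d) for d in diffs)
    rank_of, tie_term, i = {}, 0, 0
    while i < n:
        j = i
        while j + 1 < n and magnitudes[j + 1] == magnitudes[i]:
            j += 1
        t = j - i + 1                      # size of this tie group
        rank_of[magnitudes[i]] = (i + j) / 2 + 1
        tie_term += t ** 3 - t
        i = j + 1

    w_plus = sum(rank_of[abs(d)] for d in diffs if d > 0)
    mean = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (w_plus - mean) / math.sqrt(var)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return w_plus, p


# Hypothetical toy data: one list per granularity level, one entry per
# partial repair, coded +1 (Pos), 0 (Neu), -1 (Neg).
levels = {
    "level_a": [1, 0, 1, 1, 0, 1, -1, 1, 1, 0],
    "level_b": [0, 0, 1, 0, -1, 0, -1, 1, 0, 0],
    "level_c": [-1, 0, 0, -1, -1, 0, -1, 0, 0, -1],
}

pairs = list(combinations(levels, 2))
for a, b in pairs:
    w, p = wilcoxon_signed_rank(levels[a], levels[b])
    p_adj = min(1.0, p * len(pairs))  # Bonferroni adjustment
    print(f"{a} vs {b}: W+ = {w}, adjusted p = {p_adj:.3f}")
```

With only three outcome values, many paired differences are zero and get dropped, so the effective sample per pair can be much smaller than the number of partial repairs.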
Notes from Zhen on the Friedman test and why I think it works:
Classic analogies for when to use the Friedman test are:
n wine judges each rate k different wines. Are any of the k wines ranked consistently higher or lower than the others?
The 3 granularity levels are the k wines. The 1884+596 partial repairs are the n judges. The Pos/Neu/Neg change in tests is the rating each judge gives.
n welders each use k welding torches, and the resulting welds are rated on quality. Do any of the k torches produce consistently better or worse welds?
The 3 granularity levels are the k welding torches. The 1884+596 partial repairs are the n welders. The Pos/Neu/Neg change in tests is the weld quality rating.
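To make the analogy concrete, here is a rough sketch of how the Friedman statistic would be computed under this mapping: each row is one partial repair (a judge or welder), each column is one granularity level (a wine or torch), values are ranked within each row, and the column rank sums are compared. The function names and the four toy rows are hypothetical, and the +1/0/-1 coding of Pos/Neu/Neg is an assumption. The p-value uses the chi-square approximation; the closed form exp(-Q/2) applies only because k = 3 gives 2 degrees of freedom.

```python
import math


def midranks(row):
    """Rank values within one block; tied values share their average rank."""
    order = sorted(range(len(row)), key=lambda j: row[j])
    ranks = [0.0] * len(row)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and row[order[j + 1]] == row[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for p in range(i, j + 1):
            ranks[order[p]] = avg
        i = j + 1
    return ranks


def friedman(blocks):
    """Tie-corrected Friedman statistic Q and the per-column rank sums."""
    n, k = len(blocks), len(blocks[0])
    rank_sums = [0.0] * k
    ties = 0
    for row in blocks:
        for j, r in enumerate(midranks(row)):
            rank_sums[j] += r
        for v in set(row):
            t = row.count(v)
            ties += t * (t * t - 1)
    correction = 1 - ties / (n * k * (k * k - 1))
    if correction == 0:
        raise ValueError("every block is fully tied; Q is undefined")
    q = 12 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3 * n * (k + 1)
    return q / correction, rank_sums


def p_value_k3(q):
    """Chi-square upper tail with 2 degrees of freedom (valid for k = 3 only)."""
    return math.exp(-q / 2)


# Hypothetical toy data: rows = partial repairs, columns = granularity levels,
# values = change in tests coded +1 (Pos), 0 (Neu), -1 (Neg).
blocks = [
    [0, 0, 1],
    [-1, 0, 1],
    [0, 1, 1],
    [-1, 0, 0],
]
q, rank_sums = friedman(blocks)
print(f"rank sums = {rank_sums}, Q = {q:.3f}, p = {p_value_k3(q):.3f}")
```

Because the outcome has only three values, within-row ties are very common, which is why the sketch includes the tie correction rather than the plain textbook formula.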
Partial repair identification:
Granularity