squaresLab / MultiEdit_Experiments

What are multiedit bugs?

Run stat tests on the partial repairs statistics #47

Open zhenyudg opened 4 years ago

zhenyudg commented 4 years ago

Partial repair identification:

Granularity


Notes from Zhen on the Friedman test and why I think it works:

Classic analogies for when to use the Friedman test are:

n wine judges each rate k different wines. Are any of the k wines ranked consistently higher or lower than the others?

In our case, the 3 granularity levels are the k wines, the 1884+596 partial repairs are the n judges, and the Pos/Neu/Neg change in tests is the rating each judge gives.

n welders each use k welding torches, and the ensuing welds were rated on quality. Do any of the k torches produce consistently better or worse welds?

In our case, the 3 granularity levels are the k welding torches, the 1884+596 partial repairs are the n welders, and the Pos/Neu/Neg change in tests is the quality rating of each weld.
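
To make the mapping concrete, here is a minimal pure-Python sketch of the Friedman statistic with the usual tie correction. It assumes Pos/Neu/Neg is encoded as +1/0/-1 (my choice of encoding, not something fixed in this issue), and the six rows are made-up placeholders, not our actual partial repair data. The function names are hypothetical. The p-value shortcut only covers k = 3, where the chi-squared reference distribution has 2 degrees of freedom.

```python
import math


def average_ranks(row):
    """Rank one block's scores (1 = smallest), sharing the mean rank across ties."""
    order = sorted(range(len(row)), key=lambda i: row[i])
    ranks = [0.0] * len(row)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and row[order[j + 1]] == row[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def friedman_test(blocks):
    """blocks: n rows (partial repairs), each with k scores (granularity levels).

    Returns (statistic, p_value); p_value is None unless k == 3.
    """
    n = len(blocks)
    k = len(blocks[0])
    rank_sums = [0.0] * k
    tie_term = 0
    for row in blocks:
        for col, rank in enumerate(average_ranks(row)):
            rank_sums[col] += rank
        for value in set(row):
            t = row.count(value)
            tie_term += t * (t * t - 1)
    correction = 1 - tie_term / (k * (k * k - 1) * n)
    if correction == 0:
        raise ValueError("every block is fully tied; the statistic is undefined")
    raw = 12 / (n * k * (k + 1)) * sum(s * s for s in rank_sums) - 3 * n * (k + 1)
    stat = raw / correction
    # With k == 3 the reference distribution is chi-squared with 2 degrees of
    # freedom, whose upper tail probability is exactly exp(-x / 2).
    p_value = math.exp(-stat / 2) if k == 3 else None
    return stat, p_value


# Made-up example: 6 partial repairs x 3 granularity levels, +1/0/-1 = Pos/Neu/Neg.
example_blocks = [
    [1, 0, -1],
    [1, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
    [1, 0, -1],
    [1, -1, 0],
]
stat, p_value = friedman_test(example_blocks)
print(f"Friedman statistic = {stat:.3f}, p = {p_value:.4f}")
```

With only three possible scores per block there will be many ties, which is why the tie correction matters here. For the real data, a library routine such as SciPy's `scipy.stats.friedmanchisquare` is probably the better choice than this hand-rolled version.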

zhenyudg commented 4 years ago

Alternatively, if we don't weight the partial repairs, we could treat this as categorical -> categorical data. In that case, we can use a chi^2 test.
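
A rough sketch of what the chi^2 option could look like, assuming the table is granularity level (rows) by Pos/Neu/Neg outcome (columns). That layout is my reading of "categorical -> categorical", and the counts are invented for illustration only. The function names are hypothetical. Note that the test assumes the counted observations are independent of each other.

```python
import math


def chi2_independence(table):
    """Pearson chi-squared statistic and degrees of freedom for a contingency table."""
    n_rows, n_cols = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(n_rows)) for j in range(n_cols)]
    total = sum(row_totals)
    stat = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            # Expected count if row and column categories were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (table[i][j] - expected) ** 2 / expected
    df = (n_rows - 1) * (n_cols - 1)
    return stat, df


def chi2_sf_even_df(x, df):
    """Upper tail probability of a chi-squared variable; closed form for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers positive even df")
    half = x / 2
    term = 1.0
    series = 1.0
    # exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
    for i in range(1, df // 2):
        term *= half / i
        series += term
    return math.exp(-half) * series


# Made-up counts: rows = 3 granularity levels, columns = Pos, Neu, Neg.
example_table = [
    [30, 10, 10],
    [20, 20, 10],
    [10, 20, 20],
]
stat, df = chi2_independence(example_table)
p_value = chi2_sf_even_df(stat, df)
print(f"chi^2 = {stat:.3f}, df = {df}, p = {p_value:.6f}")
```

A 3x3 table gives 4 degrees of freedom, so the even-df closed form is enough here. For anything beyond a quick check, SciPy's `scipy.stats.chi2_contingency` computes the same statistic and handles arbitrary table shapes.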