PSIAIMS / CAMIS

https://psiaims.github.io/CAMIS/
Apache License 2.0

Wilcoxon (Mann-Whitney) test #171

Closed adrianolszewski closed 6 months ago

adrianolszewski commented 8 months ago

Hello,

I would like to address two topics. These things aren't essential from the CAMIS perspective, so can be ignored, but I think it's good to keep precision anyway.

1) In the example for wilcox.test it's mentioned that it "tests whether the median difference between pairs is equal to zero." One sentence later: "It is the non-parametric equivalent to two-sample t-test, where the two groups are not paired."

This is a bit imprecise in general. The MW(W) test tests neither the difference in medians nor the median difference in general, unless strong distributional assumptions are met: 1) both samples come from distributions with the same dispersion and the same shape (not necessarily normal), and 2) those distributions are symmetric around their medians. Without these assumptions it's very easy to obtain a very low p-value even when the difference in medians = 0 or the median difference = 0. ( https://github.com/adrianolszewski/Mann-Whitney-is-not-about-medians-in-general ) In general, this test is about stochastic superiority: whether Prob(A>B) differs from Prob(B>A).
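To make the Prob(A>B) point concrete, here is a minimal Python sketch (standard library only; it does not call the R functions discussed in this issue). It builds two artificial samples, `a` and `b`, that are mirror images of each other, so both sample medians are exactly 0, yet the estimated Prob(A>B) is far from 0.5 and the normal-approximation p-value is tiny. The data, the helper names `prob_superiority` and `mann_whitney_p`, and the choice to skip tie and continuity corrections are assumptions made purely for illustration. It covers only the "difference in medians = 0" case.

```python
import math
import statistics

def prob_superiority(a, b):
    """Estimate P(A > B) + 0.5 * P(A == B) over all cross-sample pairs."""
    wins = sum((x > y) + 0.5 * (x == y) for x in a for y in b)
    return wins / (len(a) * len(b))

def mann_whitney_p(a, b):
    """Two-sided p-value from the normal approximation of U
    (illustrative: no tie correction, no continuity correction)."""
    n1, n2 = len(a), len(b)
    u = prob_superiority(a, b) * n1 * n2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    return math.erfc(abs(z) / math.sqrt(2))

# Artificial data: evenly spaced quantiles of an exponential distribution,
# centred on the middle quantile so that the sample median is exactly 0.
n = 501
quantiles = [-math.log(1 - (i + 0.5) / n) for i in range(n)]
centre = quantiles[n // 2]
a = [q - centre for q in quantiles]  # right-skewed, median 0
b = [-x for x in a]                  # mirror image: left-skewed, median 0

median_gap = statistics.median(a) - statistics.median(b)
p_sup = prob_superiority(a, b)
p_value = mann_whitney_p(a, b)
print(median_gap, round(p_sup, 3), p_value)
```

The two samples differ in shape rather than in median, which is exactly the situation where reading the test as a "test of medians" goes wrong.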

Second, if we are talking about "not paired" data, then why do we say "difference between pairs"? This might be confusing and may suggest the paired case. If we refer to the Hodges-Lehmann estimator of the pseudo-median, then it's the median of all Walsh averages: the means of each possible pair of values in the sample, including the pair of each value with itself. (For two independent samples, the corresponding Hodges-Lehmann estimate is the median of all pairwise differences between the two samples.) More details can be found e.g. here:
https://aakinshin.net/posts/r-hodges-lehmann-problems/
https://support.minitab.com/en-us/minitab/help-and-how-to/statistics/nonparametrics/how-to/1-sample-wilcoxon/methods-and-formulas/methods-and-formulas/
https://stats.stackexchange.com/questions/215889/prove-the-relationship-between-walsh-averages-and-wilcoxon-signed-rank-test
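The two Hodges-Lehmann constructions (Walsh averages for one sample, pairwise differences for two independent samples) can be sketched in a few lines of Python. This is a brute-force illustration using only the standard library, not the algorithm used by any particular R or SAS routine; the function names `hl_one_sample` and `hl_two_sample` and the toy data are made up for the example.

```python
import statistics

def hl_one_sample(x):
    """Pseudo-median: median of the Walsh averages (x[i] + x[j]) / 2
    over all pairs with i <= j, i.e. including each value with itself."""
    walsh = [(x[i] + x[j]) / 2 for i in range(len(x)) for j in range(i, len(x))]
    return statistics.median(walsh)

def hl_two_sample(x, y):
    """Location-shift estimate: median of all pairwise differences x[i] - y[j]."""
    return statistics.median(xi - yj for xi in x for yj in y)

pseudo_median = hl_one_sample([1, 2, 4])       # Walsh averages: 1, 1.5, 2, 2.5, 3, 4
shift = hl_two_sample([1, 2, 4], [0, 1, 5])    # 9 pairwise differences
print(pseudo_median, shift)
```

Note that neither quantity is the sample median or a difference of sample medians: for `[1, 2, 4]` the sample median is 2, while the pseudo-median is 2.25.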

Third, I would be careful with saying it's the "counterpart of the t-test", which suggests (at least to me) that this test does "about the same, only in a non-parametric way". It cannot do the same, since the null hypotheses of the two tests are very different. So, if we decide to use the MW(W), we need to remember that the parametric t-test and the non-parametric MW(W) look at the data from different perspectives. For the MW(W), this reduces to comparing means only under strong distributional assumptions, and in that case it may lack power anyway. As a conclusion, I would skip the part referring to the t-test. Simply: it just compares 2 samples in a non-parametric way.

/ Side note: if the assumptions are not met, we can still use a test that preserves the null hypothesis of the t-test (even though it's now non-parametric): the permutation or bootstrap t-test (bootstrapped after shifting the means, so that resampling happens under a true H0), also in its Welch and Yuen-Welch variants, as implemented, for instance, in the MKinfer package. This one actually does about the same in a non-parametric way. /
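As a rough illustration of the permutation idea in the side note, the following Python sketch computes a Monte Carlo permutation p-value for the Welch t statistic. It is a simplified stand-in written for this thread, not the MKinfer implementation; the function names, the default of 2000 permutations, the fixed seed, and the toy data are all assumptions.

```python
import math
import random
import statistics

def welch_t(x, y):
    """Welch t statistic: mean difference scaled by the unpooled standard error."""
    se = math.sqrt(statistics.variance(x) / len(x) + statistics.variance(y) / len(y))
    return (statistics.fmean(x) - statistics.fmean(y)) / se

def permutation_welch_p(x, y, n_perm=2000, seed=1):
    """Two-sided Monte Carlo permutation p-value for the Welch t statistic.
    Group labels are reshuffled; the statistic is recomputed each time."""
    rng = random.Random(seed)
    observed = abs(welch_t(x, y))
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs(welch_t(pooled[:len(x)], pooled[len(x):])) >= observed:
            hits += 1
    # The +1 terms count the observed labelling as one of the permutations.
    return (hits + 1) / (n_perm + 1)

low = [float(v) for v in range(1, 11)]    # 1..10
high = [float(v) for v in range(11, 21)]  # 11..20
p_far = permutation_welch_p(low, high)    # clearly separated groups
p_same = permutation_welch_p(low, low)    # identical groups, observed t = 0
print(p_far, p_same)
```

The statistic being permuted is still a standardized difference in means, so the hypothesis stays the one the t-test addresses; only the reference distribution is obtained by resampling instead of from the t distribution.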

2) In R there are 3 main implementations of the MW(W) test:

If someone has access to SAS, it would be worth checking the 3 options against it, as I expect substantial differences when there are ties in the data (e.g. when analysing scores, doses, questionnaire answers, stages, and so on).
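One reason ties matter can be shown with a short Python sketch: under the normal approximation, the variance of U shrinks when the standard tie correction is applied, so the same data give a different p-value depending on whether an implementation corrects for ties. This sketch makes no claim about what any specific R or SAS procedure does; the function names and the small score data are hypothetical, and no continuity correction is applied.

```python
import math
from collections import Counter

def u_statistic(x, y):
    """Mann-Whitney U for x, counting each tied cross-sample pair as half a win."""
    return sum((a > b) + 0.5 * (a == b) for a in x for b in y)

def u_variance(x, y, tie_correction=True):
    """Null variance of U; optionally reduced by the usual tie term
    sum(t^3 - t) / (N (N - 1)), where t are the sizes of the tied groups."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    tie_term = 0.0
    if tie_correction:
        counts = Counter(list(x) + list(y)).values()
        tie_term = sum(t**3 - t for t in counts) / (n * (n - 1))
    return n1 * n2 / 12 * ((n + 1) - tie_term)

def normal_p(x, y, tie_correction=True):
    """Two-sided normal-approximation p-value (no continuity correction)."""
    mean_u = len(x) * len(y) / 2
    z = (u_statistic(x, y) - mean_u) / math.sqrt(u_variance(x, y, tie_correction))
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical questionnaire-style scores with many ties.
scores_x = [1, 1, 2, 2, 3]
scores_y = [2, 3, 3, 4, 4]
p_plain = normal_p(scores_x, scores_y, tie_correction=False)
p_tied = normal_p(scores_x, scores_y, tie_correction=True)
print(p_plain, p_tied)
```

Exact (permutation-based) p-values, continuity corrections, and different tie-handling rules add further sources of disagreement, which is why a side-by-side check on tied data is worthwhile.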

I would also check the results for the pseudomedian against this blog: https://aakinshin.net/posts/r-hodges-lehmann-problems/

DrLynTaylor commented 7 months ago

Thanks so much @adrianolszewski, we really value your input & expertise to help us ensure our language is accurate and precise. Agnieszka & I are continuing to look at signed rank, Wilcoxon rank sum & Hodges-Lehmann, so we will go through your comments in detail and ensure the website gets updated. Many thanks for your help in highlighting these points.