AntoineSoetewey / statsandr

A blog on statistics and R aiming at helping academics and professionals working with data to grasp important concepts in statistics and to apply them in R. See www.statsandr.com

idea #7

Closed krzysiektr closed 4 years ago

krzysiektr commented 4 years ago

The article says that "for the Wilcoxon test the medians are compared", which is not quite accurate.

From the Details section of the wilcox.test documentation (http://finzi.psych.upenn.edu/R/library/stats/html/wilcox.test.html):

Note that in the two-sample case the estimator for the difference in location parameters does not estimate the difference in medians (a common misconception) but rather the median of the difference between a sample from x and a sample from y.

# Hodges-Lehmann estimator: median of all pairwise differences (Boy - Girl)
> Boy <- subset(dat, Sex == "Boy")$Grade
> Girl <- subset(dat, Sex == "Girl")$Grade
> diff <- outer(Boy, Girl, "-")
> diff
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
 [1,]   -3   -2    7   -1    8    9    0   -3   -4     7     5    -2
 [2,]  -14  -13   -4  -12   -3   -2  -11  -14  -15    -4    -6   -13
 [3,]   -4   -3    6   -2    7    8   -1   -4   -5     6     4    -3
 [4,]  -17  -16   -7  -15   -6   -5  -14  -17  -18    -7    -9   -16
 [5,]   -5   -4    5   -3    6    7   -2   -5   -6     5     3    -4
 [6,]   -4   -3    6   -2    7    8   -1   -4   -5     6     4    -3
 [7,]  -15  -14   -5  -13   -4   -3  -12  -15  -16    -5    -7   -14
 [8,]  -12  -11   -2  -10   -1    0   -9  -12  -13    -2    -4   -11
 [9,]   -4   -3    6   -2    7    8   -1   -4   -5     6     4    -3
[10,]  -13  -12   -3  -11   -2   -1  -10  -13  -14    -3    -5   -12
[11,]  -12  -11   -2  -10   -1    0   -9  -12  -13    -2    -4   -11
[12,]   -5   -4    5   -3    6    7   -2   -5   -6     5     3    -4
> median(diff)
[1] -4
> library("coin")
> wilcox_test(Grade ~ Sex, data = dat, conf.int = TRUE, distribution = exact())

    Exact Wilcoxon-Mann-Whitney Test

data:  Grade by Sex (Boy, Girl)
Z = -2.3449, p-value = 0.01763
alternative hypothesis: true mu is not equal to 0
95 percent confidence interval:
 -10  -1
sample estimates:
difference in location 
                    -4 

This amounts to equality of medians only if the distributions are symmetric and have the same scale parameter.
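Since `dat` is not included above, here is a minimal self-contained sketch of the same point. The vectors `x` and `y` are hypothetical, chosen only so that the difference in medians and the median of the pairwise differences come out different:

```r
# Hypothetical samples (not the Grade data from the article)
x <- c(1, 2, 3, 10, 11)
y <- c(4, 5, 6, 7, 8)

# Naive "difference in medians"
med_diff <- median(x) - median(y)

# Hodges-Lehmann estimator: median of all length(x) * length(y) pairwise differences
pairwise <- outer(x, y, "-")
hl <- median(pairwise)

# Estimate reported by stats::wilcox.test, for comparison with hl
wt <- wilcox.test(x, y, conf.int = TRUE)

print(c(difference_in_medians = med_diff, hodges_lehmann = hl))
print(wt$estimate)
```

With these values the two quantities differ, so reporting the Wilcoxon estimate as a "difference in medians" would be misleading here.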

AntoineSoetewey commented 4 years ago

Thanks for pointing it out! So, if I understand correctly, it's a bit like Student's t-test for paired samples, which uses the mean of the differences between samples x and y. I have corrected the article.
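A rough sketch of that analogy, using hypothetical vectors `x` and `y` of equal length. The paired t-test works on the n within-pair differences, whereas the two-sample Wilcoxon estimate is built from all n × m cross differences, so the comparison is only a loose one:

```r
# Hypothetical data, for illustration only
x <- c(12, 15, 9, 20)
y <- c(10, 11, 14, 13)

# Paired t-test: the estimate is the mean of the n within-pair differences
paired <- t.test(x, y, paired = TRUE)
mean_of_diffs <- mean(x - y)

# Two-sample Wilcoxon: the estimate is the median of all n * m pairwise differences
pairwise <- outer(x, y, "-")
median_of_pairwise <- median(pairwise)

print(unname(paired$estimate))  # same value as mean_of_diffs
print(mean_of_diffs)
print(length(pairwise))         # number of pairwise differences
print(median_of_pairwise)
```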