ProjectMOSAIC / mosaic

Project MOSAIC R package
http://mosaic-web.org/

statTally produces unexpected 1-sided results #733

Closed cbaldassano closed 5 years ago

cbaldassano commented 5 years ago

Hello, I have been using mosaic in my intro stats class and in general have been very happy with it. However, I've hit one issue that has been causing a lot of confusion for students. When statTally is called for a 1-sided test with a sample statistic that lies in the opposite direction from the alternative hypothesis, it reflects the statistic around the center before counting, which gives an unexpected tail count.

For example, the code

rdata <- data.frame("null" = rnorm(1000))
statTally(-1, rdata, alternative="greater")

produces

Of the 1001 samples (1 original + 1000 random),

    1 ( 0.1 % ) had test stats = -1

    183 ( 18.28 % ) had test stats >= 1

This is quite counterintuitive to both my students and me. The correct 1-sided p-value here should be based on the number of null samples >= -1, not >= 1.
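
To make the difference concrete, here is a minimal base-R sketch that counts both tails directly, without calling statTally. The seed is arbitrary, the center is assumed to be 0, and adding 1 for the original sample mirrors the "1 original + 1000 random" wording in the output above rather than any confirmed detail of the implementation.

```r
set.seed(733)                # arbitrary seed, only for reproducibility
null  <- rnorm(1000)         # simulated null distribution, as in the example
dstat <- -1                  # observed statistic

# Count the 1-sided test should use for alternative = "greater"
n_expected <- sum(null >= dstat)

# Count obtained when dstat is mirrored around an assumed center of 0
n_mirrored <- sum(null >= abs(dstat))

# Proportions out of 1001 samples (the original counts toward its own tail)
p_expected <- (n_expected + 1) / (length(null) + 1)
p_mirrored <- (n_mirrored + 1) / (length(null) + 1)

c(expected = p_expected, mirrored = p_mirrored)
```

For a standard normal null, the first proportion should land near 0.84 and the second near 0.16, which is the gap students are seeing.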

The flipping seems to occur in lines 134-137 of statTally.R:

hi <- center + abs(dstat - center)
lo <- center - abs(dstat - center)
if (alternative == 'greater') lo <- -Inf
if (alternative == 'less')    hi <-  Inf
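
Tracing those four lines by hand shows where the 1 comes from. The sketch below assumes center = 0 purely for illustration (statTally determines its own center) and plugs in the values from the example call.

```r
# Assumed inputs for illustration only
center      <- 0
dstat       <- -1
alternative <- "greater"

# The four lines quoted above, unchanged
hi <- center + abs(dstat - center)   # 0 + |-1 - 0| = 1
lo <- center - abs(dstat - center)   # 0 - |-1 - 0| = -1
if (alternative == 'greater') lo <- -Inf
if (alternative == 'less')    hi <-  Inf

c(lo = lo, hi = hi)
```

With these inputs the upper cutoff is 1 rather than -1, which matches the ">= 1" line in the output.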

I would argue that in the 1-sided cases, the dstat should not be flipped around the center. So a possible fix would be

if (alternative == 'two.sided') {
   hi <- center + abs(dstat - center)
   lo <- center - abs(dstat - center)
}
if (alternative == 'greater') {
   hi <- dstat
   lo <- -Inf
}
if (alternative == 'less') {
   hi <-  Inf
   lo <- dstat
}
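
As a rough, standalone check of the logic above, the same branches can be wrapped in a small helper. The name tail_bounds and its center = 0 default are hypothetical and are not part of mosaic; this is only a sketch of the proposed behavior.

```r
# Hypothetical helper (not a mosaic function): returns the cutoffs
# that the proposed fix would use for each alternative.
tail_bounds <- function(dstat, center = 0,
                        alternative = c("two.sided", "greater", "less")) {
  alternative <- match.arg(alternative)
  switch(alternative,
    two.sided = list(lo = center - abs(dstat - center),
                     hi = center + abs(dstat - center)),
    greater   = list(lo = -Inf, hi = dstat),   # count stats >= dstat
    less      = list(lo = dstat, hi = Inf)     # count stats <= dstat
  )
}

b_greater <- tail_bounds(-1, alternative = "greater")
b_less    <- tail_bounds(-1, alternative = "less")
b_two     <- tail_bounds(-1, alternative = "two.sided")
str(b_greater)
```

Under this sketch, a "greater" test with dstat = -1 keeps -1 as the upper cutoff, while the two-sided case still mirrors the statistic to get cutoffs of -1 and 1.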

cbaldassano commented 5 years ago

I've added a suggested pull request to fix this issue, along with a new test function for statTally - please let me know your feedback.

rpruim commented 5 years ago

Thanks @cbaldassano

rpruim commented 5 years ago

PR has been merged into beta branch.