ProjectMOSAIC / mosaic

Project MOSAIC R package
http://mosaic-web.org/
93 stars 26 forks source link

extend prop.test() to handle multiple methods for computing confidence intervals? #449

Closed rpruim closed 9 years ago

rpruim commented 9 years ago

Here are the args for stats::prop.test():

function (x, n, p = NULL, alternative = c("two.sided", "less", "greater"), 
     conf.level = 0.95, correct = TRUE) 

We could add method (or ci.method) with various methods. Possibilities include:

Some (all?) of these are done in the binom package, but I'm not a fan of the naming scheme (lots of functions named binom.foo where foo is a method or task. Wald and +4 are implemented in fastR with similarly unfortunate names.

These are familiar enough and simple enough that we could just reimplement.

rpruim commented 9 years ago

Thinking about this a little bit more, here are some other options to the design:

rpruim commented 9 years ago

I've created a function that inserts a Wald interval into the object returned from binom.test(). It should be easy to replicate for other methods. Now just need to decide on the API -- extractors or an argument to binom.test()?

> binom.test(30, 40) %>% wald()

    Exact binomial test (with Wald CI)

data:  x and n
number of successes = 30, number of trials = 40, p-value = 0.002221
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
 0.6158104 0.8841896
sample estimates:
probability of success 
                  0.75 

> binom.test(30, 40)

    Exact binomial test

data:  x and n
number of successes = 30, number of trials = 40, p-value = 0.002221
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
 0.5880380 0.8730852
sample estimates:
probability of success 
                  0.75 
rpruim commented 9 years ago

Agresti-Coull and Plus 4 are implemented as well:

> binom.test(30, 40) %>% plus4

    Exact binomial test (with Plus 4 CI)

data:  x and n
number of successes = 30, number of trials = 40, p-value = 0.002221
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
 0.5956792 0.8588663
sample estimates:
probability of success 
                  0.75 

> binom.test(30, 40) %>% agresti

    Exact binomial test (with Agresti-Coull CI)

data:  x and n
number of successes = 30, number of trials = 40, p-value = 0.002221
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
 0.5963877 0.8598015
sample estimates:
probability of success 
                  0.75 
rpruim commented 9 years ago

I've brought all of these into one function called update_ci() and added more functionality (1-sided tests). Next step is to decide whether to add a ci.method argument to binom.test(). I might be able to fix some sticky non-standard evaluation stuff that has always bothered me at the same time.

> HT <- binom.test( 30, 50, alt="gr", conf.level=.90)
> update_ci(HT, "score") 

    Exact binomial test (with Score CI)

data:  x and n
number of successes = 30, number of trials = 50, p-value = 0.1013
alternative hypothesis: true probability of success is greater than 0.5
90 percent confidence interval:
 0.4994722 1.0000000
sample estimates:
probability of success 
                   0.6 

> update_ci(HT, "wald") 

    Exact binomial test (with Wald CI)

data:  x and n
number of successes = 30, number of trials = 50, p-value = 0.1013
alternative hypothesis: true probability of success is greater than 0.5
90 percent confidence interval:
 0.5112115 1.0000000
sample estimates:
probability of success 
                   0.6 

> update_ci(HT, "agresti") 

    Exact binomial test (with Agresti-Coull CI)

data:  x and n
number of successes = 30, number of trials = 50, p-value = 0.1013
alternative hypothesis: true probability of success is greater than 0.5
90 percent confidence interval:
 0.5093406 1.0000000
sample estimates:
probability of success 
                   0.6 

> update_ci(HT, "plus4") 

    Exact binomial test (with Plus 4 CI)

data:  x and n
number of successes = 30, number of trials = 50, p-value = 0.1013
alternative hypothesis: true probability of success is greater than 0.5
90 percent confidence interval:
 0.5069023 1.0000000
sample estimates:
probability of success 
                   0.6 
rpruim commented 9 years ago

We should test my code against some other references to make sure I didn't glum up any of the formulas. Perhaps we can compare against the tools in the binom package, or @nicholasjhorton can run a comparison in SAS.

rpruim commented 9 years ago

I've now folded this all into binom.test:

> do.call(rbind, 
      Map( function(m) interval(binom.test(~sex, data=HELPrct, ci.method=m)), 
      m=c("Score", "W", "agresti", "pl")) )
        probability of success     lower     upper level
Score                0.2362031 0.1978173 0.2780728  0.95
W                    0.2362031 0.1970892 0.2753170  0.95
agresti              0.2362031 0.1993466 0.2774960  0.95
pl                   0.2362031 0.1994390 0.2775850  0.95