Closed nutterb closed 8 years ago
#' @name test_binomial
#' @export test_binomial
#' @import ParameterCheck
#'
#' @title Power and Sample Size Analysis for a Binomial Test
#' @description Determines the sample size, power, null proportion,
#' alternative proportion, or significance level for a binomial test.
#' The results also return the actual power and significance.
#'
#' @param n The sample size, or number of trials
#' @param p0 The value of the probability of a success under the null hypothesis
#' @param p1 The value of the probability of a success under the alternative hypothesis
#' @param power The power of the test
#' @param alpha Significance level for the test
#' @param alternative A character vector giving the alternative to the test.
#' Multiple values may be given, but the values must be \code{"two.tailed"},
#' \code{"left.tailed"}, or \code{"right.tailed"}.
#' @param conservative A logical vector. This determines if the sample
#' size selected is conservative (larger). This decision is required
#' because the power as a function of sample size is non-montonic in
#' the binomial distribution. In practice, it is usually better to look
#' at both and select the sample size where \code{alpha_actual} is closest
#' to \code{alpha}.
#' @param n_limits The limits of the search for when \code{n=NULL}. The
#' sample size is determined in a manner similar to \code{uniroot}, but
#' \code{uniroot} doesn't handle discrete values.
#'
#' @details Exactly one of the parameters \code{n}, \code{p0},
#' \code{p1}, \code{alpha}, and \code{power} must be passed as \code{NULL}. The only
#' exception is that \code{delta} may be passed as a second \code{NULL} when
#' \code{mu0} and \code{mu1} are specified.
#'
#' The parameters are combined via \code{expand.grid}, so all combinations
#' of the inputs are evaluated.
#'
#' @author Benjamin Nutter
#'
#' @references
#' O'Brien R, Castelloe J, "Sample-Size Analysis in Study Planning,"
#' American Statistical Association Continuing Education Program: Section on
#' Teaching Statistics in the Health Sciences, Joint Statistical Meetings,
#' San Francisco, CA; 5 August 2003 (Short Course Manual)
#'
#' Some design choices were obtained from the r-help question at:
#' \url{http://r.789695.n4.nabble.com/Sample-size-calculations-for-one-sided-binomial-exact-test-td3964313.html}
#'
#' @examples
#' #* Julia Chill's Frozen Sensations Example from O'Brien and Castelloe
#' test_binomial(n=c(20, 40), p0=.5, p1=.8, alpha=c(.01, .05),
#' alternative='right.tailed')
#'
#' #* Plot the sample size for a range of n
#' library(ggplot2)
#' Chill <- test_binomial(n=20:40, p0=.5, p1=.8, alpha=c(.01, .05),
#' alternative='right.tailed')
#'
#' ggplot(Chill, aes(x=n, y=power, colour=factor(alpha))) + geom_line()
#'
test_binomial <- function(n = NULL, p0 = NULL, p1 = NULL,
power = NULL, alpha=.05,
alternative = "two.tailed",
conservative = FALSE,
n_limits=c(2L, 200L)){
#******************************************
#* Parameter Checking
#* 1. Exactly one of n, p0, p1, power, and alpha may be NULL
#* 2. alpha must be between 0 and 1, exclusive
#* 3. power must be between 0 and 1, exclusive
#* 4. p0 must be between 0 and 1, exclusive
#* 5. p1 must be between 0 and 1, exclusive
#* 6. n must greater than 1
#* 7. alternative must be in c('two.sided', 'left.tailed', 'right.tailed')
#* 8. coerce n_limits to integer
#* 9. lower limit of n_limits must be greater than 1
#******************************************
Check <- ParameterCheck::newParamCheck()
#* 1. Exactly one of n, p0, p1, power, and alpha may be NULL
sum_null <- sum(sapply(list(n, p0, p1, power, alpha), checkmate::testNull))
Check <- ParameterCheck::addError(sum_null != 1,
"Exactly 1 of n, p0, p1, alpha, or power must be NULL",
Check)
#* 2. alpha must be between 0 and 1, exclusive
if (!is.null(alpha)){
Check <- ParameterCheck::addWarning(any(alpha <= 0) | any(alpha >= 1),
"'alpha' must be between 0 and 1, exclusive. Invalid values have been removed.",
Check)
alpha <- alpha[alpha > 0 & alpha < 1]
Check <- ParameterCheck::addError(length(alpha) == 0,
"No valid values were given for 'alpha'",
Check)
}
#* 3. power must be between 0 and 1, exclusive
if (!is.null(power)){
Check <- ParameterCheck::addWarning(any(power <= 0) | any(power >= 1),
"'power' must be between 0 and 1, exclusive. Invalid values have been removed.",
Check)
power <- power[power > 0 & power < 1]
Check <- ParameterCheck::addError(length(power) == 0,
"No valid values were given for 'power'",
Check)
}
#* 4. p0 must be between 0 and 1, exclusive
if (!is.null(p0)){
Check <- ParameterCheck::addWarning(any(p0 <= 0) | any(p0 >= 1),
"'p0' must be between 0 and 1, exclusive. Invalid values have been removed.",
Check)
p0 <- p0[p0 > 0 & p0 < 1]
Check <- ParameterCheck::addError(length(p0) == 0,
"No valid values were given for 'p0'",
Check)
}
#* 5. p1 must be between 0 and 1, exclusive
if (!is.null(p1)){
Check <- ParameterCheck::addWarning(any(p1 <= 0) | any(p1 >= 1),
"'p1' must be between 0 and 1, exclusive. Invalid values have been removed.",
Check)
p1 <- p1[p1 > 0 & p1 < 1]
Check <- ParameterCheck::addError(length(p1) == 0,
"No valid values were given for 'p1'",
Check)
}
#* 6. n must greater than 1
if (!is.null(n)){
Check <- ParameterCheck::addWarning(any(n <= 0),
"'n' must be larger than 1. Invalid values have been removed.",
Check)
n <- n[n > 1]
Check <- ParameterCheck::addError(length(n) == 0,
"No valid values were given for 'n'",
Check)
}
#* 7. alternative must be in c('two.tailed', 'left.tailed', 'right.tailed')
Check <- ParameterCheck::addError(!all(alternative %in% c("two.tailed", "left.tailed",
"right.tailed")),
paste0("Values in 'alternative' must be ",
"'two.tailed', 'left.tailed', or 'right.tailed'"),
Check)
#* 8. coerce n_limits to integer
Check <- ParameterCheck::addWarning(!checkmate::testInteger(n_limits),
"n_limits has been coerced to an integer vector",
Check)
#* 9. lower limit of n_limits must be greater than 1
n_limits <- sort(n_limits)
Check <- ParameterCheck::addWarning(n_limits[1] <= 1,
"The minimum for n_limits has been increased to 2",
Check)
if (n_limits[1] <= 1) n_limits[1] <- 2
n_limits <- n_limits[1]:n_limits[2]
#* Pass errors and warnings
ParameterCheck::finishParamCheck(Check)
#*****************************************
#* Output Data Frame
.params <- expand.grid(p0 = if (is.null(p0)) NA else p0,
p1 = if (is.null(p1)) NA else p1,
alpha = if (is.null(alpha)) NA else alpha,
alpha_actual = NA,
power = if (is.null(power)) NA else power,
power_actual = NA,
n = if (is.null(n)) NA else n,
alternative = alternative,
conservative = conservative,
stringsAsFactors=FALSE)
#* Power Equation Selection Function
power.eqn <- function(alt){
switch(alt,
"two.tailed" = quote({pbinom(qbinom(alpha/2, n, p0), n, p1) +
pbinom(qbinom(alpha/2, n, p0,
lower.tail=FALSE),
n, p1, lower.tail=FALSE)}),
"left.tailed" = quote({pbinom(qbinom(alpha, n, p0), n, p1)}),
"right.tailed" = quote({pbinom(qbinom(alpha, n, p0,
lower.tail=FALSE),
n, p1, lower.tail=FALSE)}))
}
sig.eqn <- function(alt){
switch(alt,
"two.tailed" = quote({pbinom(qbinom(alpha/2, n, p0) - 1, n, p0) +
pbinom(qbinom(alpha/2, n, p0, lower.tail=FALSE) + 1,
n, p0, lower.tail=FALSE)}),
"left.tailed" = quote({pbinom(qbinom(alpha, n, p0) - 1, n, p0)}),
"right.tailed" = quote({pbinom(qbinom(alpha, n, p0, lower.tail=FALSE) + 1,
n, p0, lower.tail=FALSE)}))
}
#* Calculate Power
if (is.null(power)){
for(i in 1:nrow(.params)){
.params$power[i] <- with(.params[i, ], eval(power.eqn(alternative)))
.params$alpha_actual[i] <- with(.params[i, ], eval(sig.eqn(alternative)))
}
.params$power_actual <- .params$power
}
#* Calculate Sample Size
#* Sample Size is calculated by calculating the difference between
#* the stated power and the power calculate for each value in n_limits.
#* The value of n that gives the smallest positive difference is the
#* chosen sample size.
if (is.null(n)){
for(i in 1:nrow(.params)){
.params$n[i] <-
with(.params[i, ],
{eqn <- function(n){eval(power.eqn(alternative)) - power}
if (conservative) n_limits[max(which(eqn(n_limits) <= 0)) + 1]
else n_limits[min(which(eqn(n_limits) >= 0))]
})
.params$alpha_actual[i] <- with(.params[i, ], eval(sig.eqn(alternative)))
.params$power_actual[i] <- with(.params[i, ], eval(power.eqn(alternative)))
#* This block is in place to help me observe fluctuations in power
#* based on sample size. Particularly when increasing sample size
#* decreases power
#* with(.params[i, ],
#* {eqn <- function(n){eval(power.eqn(alternative)) - power}
#* pwr <- function(n){eval(power.eqn(alternative))}
#* data.frame(n=n_limits,
#* power = pwr(n_limits),
#* diff = eqn(n_limits))
#* })
}
}
#* Solve for p0
if (is.null(p0)){
for(i in 1:nrow(.params)){
.params$p0[i] <-
with(.params[i, ],
uniroot(function(p0) eval(power.eqn(alternative)) - power,
c(0, 1))$root)
.params$alpha_actual[i] <- with(.params[i, ], eval(sig.eqn(alternative)))
.params$power_actual[i] <- with(.params[i, ], eval(power.eqn(alternative)))
}
}
#* Solve for p1
if (is.null(p1)){
for(i in 1:nrow(.params)){
.params$p1[i] <-
with(.params[i, ],
uniroot(function(p1) eval(power.eqn(alternative)) - power,
c(0, 1))$root)
.params$alpha_actual[i] <- with(.params[i, ], eval(sig.eqn(alternative)))
.params$power_actual[i] <- with(.params[i, ], eval(power.eqn(alternative)))
}
}
#* Solve for significance
if (is.null(alpha)){
for(i in 1:nrow(.params)){
.params$alpha[i] <-
with(.params[i, ],
uniroot(function(alpha) eval(power.eqn(alternative)) - power,
c(0, 1))$root)
.params$power_actual[i] <- with(.params[i, ], eval(power.eqn(alternative)))
}
.params$alpha_actual <- .params$alpha
}
return(.params)
}
Some example runs
#* Example with proper inputs
test_binomial(n=c(20, 40), p0=.5, p1=.8, alpha=c(.01, .05),
alternative='right.tailed')
#* Example with improper inputs
#* Returns 2 errors and 3 warnings
test_binomial(n=c(20, 40), p0=NULL, p1=.8, alpha=c(0, .01, .05),
alternative=c('right.tailed', 'one.tailed'),
n_limits=c(0, 105.3))
So the idea is to collect all messages from cm's ceck* calls in a buffer?
Until something like your finishParamCheck is called by the client programmer manually at the and of his (toplevel) function?
This might be doable. Michel is mainly leading this project so he should decide this.
In general my philosophy is that it would improve R if larger, high quality projects are organized in groups, instead of each person doing their own thing. But we need to figure out how much this changes the current structure.
If we re-use our current functions, that makes them a bit state-based, depending on an option. So it needs to be considered carefully.
We are also currently extremely busy with other stuff. Assuming that Michel would agree to the general idea would you be willing to work on this in a fork to test this out?
Until something like your finishParamCheck is called by the client programmer manually at the and of his (toplevel) function?
That is nonsense. the "finish" function must be called then here:
f(arg1, arg2) checkArg1 ---> collect checkArg2 ---> collect finishArgCheck ---------> If we pass this the basic function contract with the client user holds. We can start computing f.
Two options to design this are
startArgCheckBlock("f") --> starts collecting messages checkArg1 g(123) # now what happens here with the potential checks in g? checkArg2 endArgCheckBlock("f")
doArgCheckBlock({ --> collects messages in block checkArg1
foo = 1234 checkArg2 })
That's correct--the programmer would place a finishArgCheck
at the end of the parameter checking. If any messages are found, the function terminates to inform the user that changes to the arguments are needed.
Some better pseudo code (though I think you've got the idea now)
f <- function(arg1, arg2){
Initialize ArgCheck
checkArg1
checkArg2
Complete ArgCheck
do whatever f was supposed to do
}
Right now I'm just gathering my error and warning messages into a list. My finishParamCheck
merely calls warning
and stop
if it detects any messages in that list. It's a fairly simple scheme, though I can't speak to its efficiency.
I'd be happy to play with this in a fork if it is of interest.
startArgCheckBlock("f") --> starts collecting messages checkArg1 g(123) # now what happens here with the potential checks in g? checkArg2 endArgCheckBlock("f")
if g
is occurring within f
, I would argue that it is the responsibility of f
's programmer to ensure that arguments passed to g
are appropriate. So, if it were me, I would program argument checks in f
for anything that I would pass to g
. If anything failed to satisfy the demands of g
, I would terminate the function before calling g
.
If anything failed to satisfy the demands of g, I would terminate the function before calling g
But in your idea you propose here you terminate here:
endArgCheckBlock("f")
Which is after the call to g?
Also you need to consider the case that g itself does arg checks with checkmate.
Ah, yes. I see what you mean. And I have thought about this. Unfortunately, I don't have a solution any better than
startArgCheck --> starts collecting messages
checkArg1
endArgCheck --> Terminates if errors are found, continues if none are found
g(123) # now what happens here with the potential checks in g?
checkArg2
endArgCheck --> Terminates if errors are found, continues if none are found
I realize calling two endArgCheck
s seems to defeat the purpose of my proposal, but if checkArg2
absolutely depends on the output of g(123)
, I can't think of a better alternative. The best I can offer is that you may have reduced the number of times f
will terminate.
For example, in the binomial_test
function I shared above, there are 7 opportunities for the function to generate an error. If I absolutely had to run another function between checks 6 and 7, I've still reduce the number of stopping opportunities from 7 to 2, which increases the utility of each stopping point to the user.
in the case that g
is also using checks from checkmate
, I can see one of two things occurring.
f
to validate the arguments going to g
, ensuring that g
won't cast an error (avoiding the problem)g
casts an error that terminates f
and may or may not give any useful information for why.Option 1 is smart programming, and option 2 is no worse than the status quo (in fact, it is the status quo). So, I will admit, it isn't a perfect system, but I think it offers an improvement to the majority of functions (at least in what I'm doing. It may be quite different in other fields).
So summed up, again, we have two options:
I'd prefer option (1). The interface is a tad uglier but at least it is guaranteed to be thread safe. mclapply
would probably not play along with a global environment to store messages, and parLapply
would need support to forward options to the slave.
Two possibilities for the interface:
col = makeCollection()
checkNumeric(x, add = col)
makeReport(col, type = "stop", prefix = "Error in function foo")
col = character(0)
col = addCollection(col, checkNumeric(x))
makeReport(col, type = "stop", prefix = "Error in function foo")
The second is super simple:
addCollection = function(col, ...) {
c(col, Filter(Negate(isTRUE), list(...)))
}
Hi Michel:
A) Could you please indicate whether you like this direction in general? So whether you think this is a viable extension of checkmate? (Assuming that the structure does not becomes horrible and we find a good solution)
B) I agree that option "1." above looks a lot better (collector object vs. option state). Or that option "2." will create potential headaches. I worried already a bit about this "state-based thing". For "1." I don't see what can go wrong. It also does not look that ugly to me. (I would stick to our current checking style anyway for now in our projects)
C) I guess I like the first interface best from your stuff above.
I'll implement the first version of the interface for the next release.
This has been implemented in v1.6.3.
This makes my day! I'd forgotten you were working on this. I'll play around with it a bit in the near future. Thanks!
Would you consider adding some functionality that would allow all errors and warnings to be returned simultaneously, rather than stopping the function any time any error is produced? The model I'm trying to follow is from a SAS Global Forum presentation.
The basic concept is that any time an error or warning is produced, you record a note but continue on with the parameter checking. Then, when the parameter checks are complete, return all of the errors and warnings. That way, if the user has provided two (or more) flawed inputs, he or she can receive a message about all of them and fix them all before trying to repeat the function.
I have drafted an small package that handles these, but wondered if it might be a better fit within a package that already specializes in parameter checking (rather than add yet another package to CRAN). I'll include a sample function in a subsequent comment that illustrates what I'm trying to do.
If you're interested, great! If not, I'll publish to CRAN separately.