mllg / checkmate

Fast and versatile argument checks
https://mllg.github.io/checkmate/
Other
265 stars 30 forks source link

Functionality for returning all errors together #42

Closed nutterb closed 8 years ago

nutterb commented 9 years ago

Would you consider adding some functionality that would allow all errors and warnings to be returned simultaneously, rather than stopping the function any time any error is produced? The model I'm trying to follow is from a SAS Global Forum presentation.

The basic concept is that any time an error or warning is produced, you record a note but continue on with the parameter checking. Then, when the parameter checks are complete, return all of the errors and warnings. That way, if the user has provided two (or more) flawed inputs, he or she can receive a message about all of them and fix them all before trying to repeat the function.

I have drafted an small package that handles these, but wondered if it might be a better fit within a package that already specializes in parameter checking (rather than add yet another package to CRAN). I'll include a sample function in a subsequent comment that illustrates what I'm trying to do.

If you're interested, great! If not, I'll publish to CRAN separately.

nutterb commented 9 years ago
#' @name test_binomial
#' @export test_binomial
#' @import ParameterCheck
#' 
#' @title Power and Sample Size Analysis for a Binomial Test
#' @description Determines the sample size, power, null proportion,
#'   alternative proportion, or significance level for a binomial test.
#'   The results also return the actual power and significance.
#'   
#' @param n The sample size, or number of trials
#' @param p0 The value of the probability of a success under the null hypothesis
#' @param p1 The value of the probability of a success under the alternative hypothesis
#' @param power The power of the test
#' @param alpha Significance level for the test
#' @param alternative A character vector giving the alternative to the test.
#'   Multiple values may be given, but the values must be \code{"two.tailed"},
#'   \code{"left.tailed"}, or \code{"right.tailed"}.
#' @param conservative A logical vector.  This determines if the sample 
#'   size selected is conservative (larger).  This decision is required 
#'   because the power as a function of sample size is non-montonic in 
#'   the binomial distribution.  In practice, it is usually better to look
#'   at both and select the sample size where \code{alpha_actual} is closest 
#'   to \code{alpha}.
#' @param n_limits The limits of the search for when \code{n=NULL}.  The 
#'   sample size is determined in a manner similar to \code{uniroot}, but
#'   \code{uniroot} doesn't handle discrete values.
#'   
#' @details Exactly one of the parameters \code{n}, \code{p0},
#'   \code{p1}, \code{alpha}, and \code{power} must be passed as \code{NULL}.  The only 
#'   exception is that \code{delta} may be passed as a second \code{NULL} when
#'   \code{mu0} and \code{mu1} are specified.
#'   
#'   The parameters are combined via \code{expand.grid}, so all combinations
#'   of the inputs are evaluated.
#'   
#' @author Benjamin Nutter
#' 
#' @references
#' O'Brien R, Castelloe J, "Sample-Size Analysis in Study Planning,"
#'   American Statistical Association Continuing Education Program: Section on
#'   Teaching Statistics in the Health Sciences, Joint Statistical Meetings, 
#'   San Francisco, CA; 5 August 2003 (Short Course Manual)
#' 
#' Some design choices were obtained from the r-help question at:
#' \url{http://r.789695.n4.nabble.com/Sample-size-calculations-for-one-sided-binomial-exact-test-td3964313.html}
#' 
#' @examples
#' #* Julia Chill's Frozen Sensations Example from O'Brien and Castelloe
#' test_binomial(n=c(20, 40), p0=.5, p1=.8, alpha=c(.01, .05), 
#'               alternative='right.tailed')
#'               
#' #* Plot the sample size for a range of n
#' library(ggplot2)
#' Chill <- test_binomial(n=20:40, p0=.5, p1=.8, alpha=c(.01, .05), 
#'               alternative='right.tailed')
#'               
#' ggplot(Chill, aes(x=n, y=power, colour=factor(alpha))) + geom_line()
#' 

test_binomial <- function(n = NULL, p0 = NULL, p1 = NULL, 
                            power = NULL, alpha=.05,
                            alternative = "two.tailed",
                            conservative = FALSE,
                            n_limits=c(2L, 200L)){

  #******************************************
  #* Parameter Checking
  #* 1. Exactly one of n, p0, p1, power, and alpha may be NULL
  #* 2. alpha must be between 0 and 1, exclusive
  #* 3. power must be between 0 and 1, exclusive
  #* 4. p0 must be between 0 and 1, exclusive
  #* 5. p1 must be between 0 and 1, exclusive
  #* 6. n must greater than 1
  #* 7. alternative must be in c('two.sided', 'left.tailed', 'right.tailed')
  #* 8. coerce n_limits to integer
  #* 9. lower limit of n_limits must be greater than 1
  #******************************************

  Check <- ParameterCheck::newParamCheck()

  #* 1. Exactly one of n, p0, p1, power, and alpha may be NULL
  sum_null <- sum(sapply(list(n, p0, p1, power, alpha), checkmate::testNull))
  Check <- ParameterCheck::addError(sum_null != 1,
                    "Exactly 1 of n, p0, p1, alpha, or power must be NULL",
                    Check)

  #* 2. alpha must be between 0 and 1, exclusive
  if (!is.null(alpha)){
    Check <- ParameterCheck::addWarning(any(alpha <= 0) | any(alpha >= 1),
                        "'alpha' must be between 0 and 1, exclusive.  Invalid values have been removed.",
                        Check)
    alpha <- alpha[alpha > 0 & alpha < 1]
    Check <- ParameterCheck::addError(length(alpha) == 0,
                      "No valid values were given for 'alpha'",
                      Check)
  }

  #* 3. power must be between 0 and 1, exclusive
  if (!is.null(power)){
    Check <- ParameterCheck::addWarning(any(power <= 0) | any(power >= 1),
                        "'power' must be between 0 and 1, exclusive.  Invalid values have been removed.",
                        Check)
    power <- power[power > 0 & power < 1]
    Check <- ParameterCheck::addError(length(power) == 0,
                      "No valid values were given for 'power'",
                      Check)
  }

  #* 4. p0 must be between 0 and 1, exclusive
  if (!is.null(p0)){
    Check <- ParameterCheck::addWarning(any(p0 <= 0) | any(p0 >= 1),
                        "'p0' must be between 0 and 1, exclusive.  Invalid values have been removed.",
                        Check)
    p0 <- p0[p0 > 0 & p0 < 1]
    Check <- ParameterCheck::addError(length(p0) == 0,
                      "No valid values were given for 'p0'",
                      Check)
  }

  #* 5. p1 must be between 0 and 1, exclusive
  if (!is.null(p1)){
    Check <- ParameterCheck::addWarning(any(p1 <= 0) | any(p1 >= 1),
                        "'p1' must be between 0 and 1, exclusive.  Invalid values have been removed.",
                        Check)
    p1 <- p1[p1 > 0 & p1 < 1]
    Check <- ParameterCheck::addError(length(p1) == 0,
                      "No valid values were given for 'p1'",
                      Check)
  }

  #* 6. n must greater than 1
  if (!is.null(n)){
    Check <- ParameterCheck::addWarning(any(n <= 0),
                        "'n' must be larger than 1.  Invalid values have been removed.",
                        Check)
    n <- n[n > 1]
    Check <- ParameterCheck::addError(length(n) == 0,
                      "No valid values were given for 'n'",
                      Check)
  }

  #* 7. alternative must be in c('two.tailed', 'left.tailed', 'right.tailed')

  Check <- ParameterCheck::addError(!all(alternative %in% c("two.tailed", "left.tailed", 
                                            "right.tailed")),
                    paste0("Values in 'alternative' must be ",
                           "'two.tailed', 'left.tailed', or 'right.tailed'"),
                    Check)

  #* 8. coerce n_limits to integer
  Check <- ParameterCheck::addWarning(!checkmate::testInteger(n_limits),
                      "n_limits has been coerced to an integer vector",
                      Check)

  #* 9. lower limit of n_limits must be greater than 1
  n_limits <- sort(n_limits)
  Check <- ParameterCheck::addWarning(n_limits[1] <= 1,
                      "The minimum for n_limits has been increased to 2",
                      Check)
  if (n_limits[1] <= 1) n_limits[1] <- 2
  n_limits <- n_limits[1]:n_limits[2]

  #* Pass errors and warnings
  ParameterCheck::finishParamCheck(Check)

  #*****************************************
  #* Output Data Frame
  .params <- expand.grid(p0 = if (is.null(p0)) NA else p0,
                         p1 = if (is.null(p1)) NA else p1,
                         alpha = if (is.null(alpha)) NA else alpha,
                         alpha_actual = NA,
                         power = if (is.null(power)) NA else power,
                         power_actual = NA,
                         n = if (is.null(n)) NA else n,
                         alternative = alternative,
                         conservative = conservative,
                         stringsAsFactors=FALSE)

  #* Power Equation Selection Function
  power.eqn <- function(alt){
    switch(alt,
           "two.tailed" = quote({pbinom(qbinom(alpha/2, n, p0), n, p1) +
                                  pbinom(qbinom(alpha/2, n, p0, 
                                                lower.tail=FALSE), 
                                         n, p1, lower.tail=FALSE)}),
           "left.tailed" = quote({pbinom(qbinom(alpha, n, p0), n, p1)}),
           "right.tailed" = quote({pbinom(qbinom(alpha, n, p0,
                                                 lower.tail=FALSE), 
                                          n, p1, lower.tail=FALSE)}))
  }

  sig.eqn <- function(alt){
    switch(alt,
           "two.tailed" = quote({pbinom(qbinom(alpha/2, n, p0) - 1, n, p0) + 
                                 pbinom(qbinom(alpha/2, n, p0, lower.tail=FALSE) + 1,
                                        n, p0, lower.tail=FALSE)}),
           "left.tailed" = quote({pbinom(qbinom(alpha, n, p0) - 1, n, p0)}),
           "right.tailed" = quote({pbinom(qbinom(alpha, n, p0, lower.tail=FALSE) + 1,
                                          n, p0, lower.tail=FALSE)}))
  }

  #* Calculate Power
  if (is.null(power)){
    for(i in 1:nrow(.params)){
      .params$power[i] <- with(.params[i, ], eval(power.eqn(alternative)))
      .params$alpha_actual[i] <- with(.params[i, ], eval(sig.eqn(alternative)))
    }
    .params$power_actual <- .params$power

  }

  #* Calculate Sample Size
  #* Sample Size is calculated by calculating the difference between
  #* the stated power and the power calculate for each value in n_limits.
  #* The value of n that gives the smallest positive difference is the
  #* chosen sample size.
  if (is.null(n)){
    for(i in 1:nrow(.params)){
      .params$n[i] <- 
        with(.params[i, ],
             {eqn <- function(n){eval(power.eqn(alternative)) - power}
              if (conservative) n_limits[max(which(eqn(n_limits) <= 0)) + 1]
              else n_limits[min(which(eqn(n_limits) >= 0))]
             })
      .params$alpha_actual[i] <- with(.params[i, ], eval(sig.eqn(alternative)))
      .params$power_actual[i] <- with(.params[i, ], eval(power.eqn(alternative)))
      #* This block is in place to help me observe fluctuations in power
      #* based on sample size.  Particularly when increasing sample size 
      #* decreases power
      #* with(.params[i, ],
      #*      {eqn <- function(n){eval(power.eqn(alternative)) - power}
      #*       pwr <- function(n){eval(power.eqn(alternative))}
      #*             data.frame(n=n_limits,
      #*                        power = pwr(n_limits),
      #*                        diff = eqn(n_limits))
      #*      })
    }
  }

  #* Solve for p0
  if (is.null(p0)){
    for(i in 1:nrow(.params)){
      .params$p0[i] <- 
        with(.params[i, ],
             uniroot(function(p0) eval(power.eqn(alternative)) - power,
                     c(0, 1))$root)
      .params$alpha_actual[i] <- with(.params[i, ], eval(sig.eqn(alternative)))
      .params$power_actual[i] <- with(.params[i, ], eval(power.eqn(alternative)))
    }
  }

  #* Solve for p1
  if (is.null(p1)){
    for(i in 1:nrow(.params)){
      .params$p1[i] <- 
        with(.params[i, ],
             uniroot(function(p1) eval(power.eqn(alternative)) - power,
                     c(0, 1))$root)
      .params$alpha_actual[i] <- with(.params[i, ], eval(sig.eqn(alternative)))
      .params$power_actual[i] <- with(.params[i, ], eval(power.eqn(alternative)))
    }
  }

  #* Solve for significance
  if (is.null(alpha)){
    for(i in 1:nrow(.params)){
      .params$alpha[i] <- 
        with(.params[i, ],
             uniroot(function(alpha) eval(power.eqn(alternative)) - power,
                     c(0, 1))$root)
      .params$power_actual[i] <- with(.params[i, ], eval(power.eqn(alternative)))
    }
    .params$alpha_actual <- .params$alpha
  }

  return(.params)
}                       

Some example runs

#* Example with proper inputs
test_binomial(n=c(20, 40), p0=.5, p1=.8, alpha=c(.01, .05), 
              alternative='right.tailed')

#* Example with improper inputs
#* Returns 2 errors and 3 warnings
test_binomial(n=c(20, 40), p0=NULL, p1=.8, alpha=c(0, .01, .05),
              alternative=c('right.tailed', 'one.tailed'),
              n_limits=c(0, 105.3))
berndbischl commented 9 years ago

So the idea is to collect all messages from cm's ceck* calls in a buffer?

Until something like your finishParamCheck is called by the client programmer manually at the and of his (toplevel) function?

This might be doable. Michel is mainly leading this project so he should decide this.

In general my philosophy is that it would improve R if larger, high quality projects are organized in groups, instead of each person doing their own thing. But we need to figure out how much this changes the current structure.

If we re-use our current functions, that makes them a bit state-based, depending on an option. So it needs to be considered carefully.

We are also currently extremely busy with other stuff. Assuming that Michel would agree to the general idea would you be willing to work on this in a fork to test this out?

berndbischl commented 9 years ago

Until something like your finishParamCheck is called by the client programmer manually at the and of his (toplevel) function?

That is nonsense. the "finish" function must be called then here:

f(arg1, arg2) checkArg1 ---> collect checkArg2 ---> collect finishArgCheck ---------> If we pass this the basic function contract with the client user holds. We can start computing f.

berndbischl commented 9 years ago

Two options to design this are

startArgCheckBlock("f") --> starts collecting messages checkArg1 g(123) # now what happens here with the potential checks in g? checkArg2 endArgCheckBlock("f")

doArgCheckBlock({ --> collects messages in block checkArg1

this is often interspersed in my code, the assignment is now not visible outside of the block

foo = 1234 checkArg2 })

nutterb commented 9 years ago

That's correct--the programmer would place a finishArgCheck at the end of the parameter checking. If any messages are found, the function terminates to inform the user that changes to the arguments are needed.

Some better pseudo code (though I think you've got the idea now)

f <- function(arg1, arg2){
   Initialize ArgCheck
   checkArg1
   checkArg2
   Complete ArgCheck

   do whatever f was supposed to do
}

Right now I'm just gathering my error and warning messages into a list. My finishParamCheck merely calls warning and stop if it detects any messages in that list. It's a fairly simple scheme, though I can't speak to its efficiency.

I'd be happy to play with this in a fork if it is of interest.

nutterb commented 9 years ago

startArgCheckBlock("f") --> starts collecting messages checkArg1 g(123) # now what happens here with the potential checks in g? checkArg2 endArgCheckBlock("f")

if g is occurring within f, I would argue that it is the responsibility of f's programmer to ensure that arguments passed to g are appropriate. So, if it were me, I would program argument checks in f for anything that I would pass to g. If anything failed to satisfy the demands of g, I would terminate the function before calling g.

berndbischl commented 9 years ago

If anything failed to satisfy the demands of g, I would terminate the function before calling g

But in your idea you propose here you terminate here:

endArgCheckBlock("f")

Which is after the call to g?

berndbischl commented 9 years ago

Also you need to consider the case that g itself does arg checks with checkmate.

nutterb commented 9 years ago

Ah, yes. I see what you mean. And I have thought about this. Unfortunately, I don't have a solution any better than

startArgCheck --> starts collecting messages
checkArg1 
endArgCheck --> Terminates if errors are found, continues if none are found
g(123) # now what happens here with the potential checks in g?
checkArg2 
endArgCheck --> Terminates if errors are found, continues if none are found

I realize calling two endArgChecks seems to defeat the purpose of my proposal, but if checkArg2 absolutely depends on the output of g(123), I can't think of a better alternative. The best I can offer is that you may have reduced the number of times f will terminate.

For example, in the binomial_test function I shared above, there are 7 opportunities for the function to generate an error. If I absolutely had to run another function between checks 6 and 7, I've still reduce the number of stopping opportunities from 7 to 2, which increases the utility of each stopping point to the user.

in the case that g is also using checks from checkmate, I can see one of two things occurring.

  1. We program f to validate the arguments going to g, ensuring that g won't cast an error (avoiding the problem)
  2. g casts an error that terminates f and may or may not give any useful information for why.

Option 1 is smart programming, and option 2 is no worse than the status quo (in fact, it is the status quo). So, I will admit, it isn't a perfect system, but I think it offers an improvement to the majority of functions (at least in what I'm doing. It may be quite different in other fields).

mllg commented 9 years ago

So summed up, again, we have two options:

  1. Create an collector object and pass it around
  2. Change a state (i.e. an option) which influences the functions program flow to collect error messages instead of throwing them instantly.

I'd prefer option (1). The interface is a tad uglier but at least it is guaranteed to be thread safe. mclapply would probably not play along with a global environment to store messages, and parLapply would need support to forward options to the slave.

Two possibilities for the interface:

col = makeCollection()
checkNumeric(x, add = col)
makeReport(col, type = "stop", prefix = "Error in function foo")
col = character(0)
col = addCollection(col, checkNumeric(x))
makeReport(col, type = "stop", prefix = "Error in function foo")

The second is super simple:

addCollection = function(col, ...) {
  c(col, Filter(Negate(isTRUE), list(...)))
}
berndbischl commented 9 years ago

Hi Michel:

A) Could you please indicate whether you like this direction in general? So whether you think this is a viable extension of checkmate? (Assuming that the structure does not becomes horrible and we find a good solution)

B) I agree that option "1." above looks a lot better (collector object vs. option state). Or that option "2." will create potential headaches. I worried already a bit about this "state-based thing". For "1." I don't see what can go wrong. It also does not look that ugly to me. (I would stick to our current checking style anyway for now in our projects)

C) I guess I like the first interface best from your stuff above.

mllg commented 9 years ago

I'll implement the first version of the interface for the next release.

mllg commented 8 years ago

This has been implemented in v1.6.3.

nutterb commented 8 years ago

This makes my day! I'd forgotten you were working on this. I'll play around with it a bit in the near future. Thanks!