philchalmers / SimDesign

Structure for organizing Monte Carlo simulations in R
http://philchalmers.github.io/SimDesign/

Is it possible to capture all errors when using multiple analysis functions? #20

Closed: marklhc closed this issue 2 years ago

marklhc commented 3 years ago

Hi Phil, I have another question about using multiple analysis functions. In a current project I want to compare the convergence rates of different methods, but SimDesign draws a new sample whenever one of the analysis functions throws an error. My understanding is that SimDesign currently stops a replication as soon as the first error is found, which means that the convergence rate of each analysis depends on the ordering of the methods, as in the following example:

library(SimDesign)
# SimFunctions(comments=FALSE)

Design <- createDesign(N = c(10, 20, 30))

Generate <- function(condition, fixed_objects = NULL) {
  ret <- with(condition, if (runif(1) > 0.8) rep(NA, N) else rnorm(N))
  ret
}

Analyse.a1 <- function(condition, dat, fixed_objects = NULL) {
  if (any(is.na(dat))) {
    stop("mean did not converge")
  }
  ret <- mean(dat)
  ret
}

Analyse.a2 <- function(condition, dat, fixed_objects = NULL) {
  if (any(is.na(dat))) {
    stop("median did not converge")
  }
  ret <- median(dat)
  ret
}

Summarise <- function(condition, results, fixed_objects = NULL) {
  ret <- c(bias = bias(results, 0))
  ret
}

result <- runSimulation(Design,
                        replications = 10, seed = rep(103, 3),
                        generate = Generate, 
                        analyse = list(a1 = Analyse.a1, a2 = Analyse.a2), 
                        summarise = Summarise
)
#> 
#> Design row: 1/3;   Started: Fri Oct 22 15:04:57 2021;   Total elapsed time: 0.00s 
#> 
#> Design row: 2/3;   Started: Fri Oct 22 15:04:57 2021;   Total elapsed time: 0.01s 
#> 
#> Design row: 3/3;   Started: Fri Oct 22 15:04:57 2021;   Total elapsed time: 0.02s
#> 
#> Simulation complete. Total execution time: 0.02s

# Switch the order of a1 and a2
result2 <- runSimulation(Design,
                        replications = 10, seed = rep(103, 3),
                        generate = Generate, 
                        analyse = list(a2 = Analyse.a2, a1 = Analyse.a1), 
                        summarise = Summarise
)
#> 
#> Design row: 1/3;   Started: Fri Oct 22 15:04:57 2021;   Total elapsed time: 0.00s 
#> 
#> Design row: 2/3;   Started: Fri Oct 22 15:04:57 2021;   Total elapsed time: 0.01s 
#> 
#> Design row: 3/3;   Started: Fri Oct 22 15:04:57 2021;   Total elapsed time: 0.01s
#> 
#> Simulation complete. Total execution time: 0.02s

attr(result, which = "ERROR")
#> # A tibble: 3 × 1
#>   `ERROR: .mean did not converge\n`
#>                               <dbl>
#> 1                                 1
#> 2                                 2
#> 3                                 4
attr(result2, which = "ERROR")
#> # A tibble: 3 × 1
#>   `ERROR: .median did not converge\n`
#>                                 <dbl>
#> 1                                   1
#> 2                                   2
#> 3                                   4

Created on 2021-10-22 by the reprex package (v2.0.1)

Would it be possible to add an option so that, for a given replication, all functions in the analysis list are run and all errors are recorded? I know I could treat method as another design factor, but I want to use the same data sets when comparing the different analysis functions. What I have done previously is to use multiple tryCatch() calls inside a single Analyse() function and throw an error if any one of the analyses resulted in an error; when multiple analyses resulted in errors I just chained the error messages together (there may be better ways to do it). I'm curious what you think about that, and whether something like that can be done with multiple analysis functions.
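A minimal sketch of that workaround, reusing Analyse.a1() and Analyse.a2() from the reprex above (the chaining details are illustrative, not the exact code used):

Analyse <- function(condition, dat, fixed_objects = NULL) {
  # Sketch only: wrap each method in tryCatch() and chain any error messages
  fits <- list(a1 = tryCatch(Analyse.a1(condition, dat, fixed_objects),
                             error = function(e) e),
               a2 = tryCatch(Analyse.a2(condition, dat, fixed_objects),
                             error = function(e) e))
  errs <- Filter(function(x) inherits(x, "error"), fits)
  if (length(errs)) {
    # combine the messages so one chained error is recorded for this replication
    stop(paste(names(errs), vapply(errs, conditionMessage, character(1)),
               sep = ": ", collapse = "; "))
  }
  unlist(fits)
}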

Thank you very much!

Mark

philchalmers commented 3 years ago

Regarding the use of the package for your specific desideratum, sharing the exact same datasets across analysis methods is not currently possible for general design reasons. However, I wonder whether it is something that should be desired anyway.

Wouldn't it be more meaningful to present the proportion of converged results given the data-generation conditions, rather than similar information conditional on specific seeds? Unless you're trying to identify the exact reason why one method succeeds over another for a given dataset (e.g., a cell count dropped below .002% in a generated data condition), I don't see the need for such an approach. Getting at this idea via your chained try-catch combo actually seems quite reasonable to me in this case, but again only if you're looking for an idea as to why the convergence issue is occurring for multiple methods.

marklhc commented 3 years ago

Hi @philchalmers, thanks a lot for your quick response! This is very helpful. Regarding using the same data set within the same condition, I'm following Skrondal (2000, p. 156, doi: 10.1207/S15327906MBR3502_1), who suggests that doing so increases precision when comparing across methods/conditions. I agree that the difference is likely small in practice when the number of replications is large.

Another reason, which happens to be my case, is that the data-generating process may be relatively expensive. In my case, I need to simulate data, compute factor scores, and then use different analysis functions with the same factor scores. With a single analysis function I can just include the factor-score computation in it; with multiple analysis functions, however, the only option I can think of is to include the factor-score computation in the data generation function. There may be other ways I'm not aware of, but otherwise simulating different data sets would mean repeating the factor-score computation a few more times. It is not a lot of extra time in this instance, but it is an example.
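As a rough sketch of that idea (the six-indicator data and the one-factor factanal() call are placeholders for the actual model), the factor scores can be computed once per replication inside Generate and returned alongside the raw data, so every analysis function sees the same scores:

# Sketch only: generated data and one-factor model are illustrative placeholders
Generate <- function(condition, fixed_objects = NULL) {
  dat <- with(condition, matrix(rnorm(N * 6), nrow = N, ncol = 6))
  fa <- factanal(dat, factors = 1, scores = "regression")
  # return both pieces; each Analyse function receives this list as `dat`
  list(raw = dat, fscores = as.vector(fa$scores))
}

Analyse.m1 <- function(condition, dat, fixed_objects = NULL) {
  mean(dat$fscores)    # reuses the shared, pre-computed scores
}

Analyse.m2 <- function(condition, dat, fixed_objects = NULL) {
  median(dat$fscores)  # same scores, different summary
}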

Thanks again!

philchalmers commented 3 years ago

Hi Mark,

In principle I agree with Skrondal's point, though it is only applicable in cases where all estimators return a valid result anyway (otherwise, you'd have an unusual pattern of missingness between estimators that makes it difficult to assume the rejected replications are MCAR). Analysing the same data within each replication is the core idea behind the basic SimDesign setup of one Generate-Analyse pairing, so you could fall back to that approach and abandon the isolated analysis functions (there's no guarantee that either approach is optimal for a given simulation design, particularly when there is shared overhead).

I think placing the model-estimation components that obtain the factor scores in the generate step is a reasonable approach, albeit a little clunky, so my general recommendation in your case is to fall back to the single generate-analyse pairing. Otherwise, some type of pre-processing analysis function would need to be included in the analyse stream (e.g., generate -> pre.analyse() -> shared objects passed to Analyse._____() -> summarise()), where, for example, the fixed_objects/dat argument could temporarily collect the common information in the pre.analyse() block to be passed down to the remaining Analyse calls. But I worry that could be confusing to front-end users..... thoughts?

marklhc commented 3 years ago

Hi Phil,

Thanks for sharing your thoughts! I agree that in my case it seems better to fall back to the single generate-analyse pairing. Adding the pre-processing step is an interesting idea, but it is not clear whether it has enough benefit over the single generate-analyse approach, given the potential confusion and the changes needed to accommodate it on the front end. Maybe it would be better to just provide a vignette for the single generate-analyse approach, and I'm happy to draft one once I figure everything out, if that may be of interest.

Thanks again!

philchalmers commented 3 years ago

Yes, I think the vignette approach is the most reasonable at this stage, and could probably be added to the wiki.

As an aside, I'm happy that the package still works in this context (even if sub-optimally). There's nothing worse than learning an entire R package just to discover something can't be done for some odd design reason ..... at least, that was my experience with a few other packages in the same simulation framework spirit as this one.

philchalmers commented 2 years ago

Hi Mark,

I've been mulling over this issue for a little while now, and decided that the default behaviour when using a list of analyse functions should be to execute each function regardless of the results from the other independent analysis components. I was initially hesitant to do that, as it can potentially add a lot of needless overhead (e.g., when comparing an ML estimator with MCMC: if the former fails, data should be re-drawn immediately until the ML criterion is satisfied, and only then should MCMC run), but I realize that the high-overhead applications are likely special cases, and it should be up to the user to decide whether the overhead cost is worth it (now via extra_options = list(try_all_analyse = FALSE)).

Using a modified version of your example above gives the following output:

library(SimDesign)
# SimFunctions(comments=FALSE)

Design <- createDesign(N = c(10, 20, 30))

generate <- function(condition, fixed_objects = NULL) {
    ret <- with(condition, rnorm(N))
    ret
}

Analyse.a1 <- function(condition, dat, fixed_objects = NULL) {
    whc <- sample(c(0, 1, 2, 3), 1, prob = c(.7, .20, .05, .05))
    if (whc == 0) {
        ret <- mean(dat)
    } else if (whc == 1) {
        ret <- t.test() # missing arguments
    } else if (whc == 2) {
        ret <- t.test("invalid") # invalid arguments
    } else if (whc == 3) {
        # throw error manually
        stop("Manual error thrown")
    }
    # manual warnings
    if (sample(c(TRUE, FALSE), 1, prob = c(.1, .9))) {
        warning("This warning happens rarely")
    }
    if (sample(c(TRUE, FALSE), 1, prob = c(.5, .5))) {
        warning("This warning happens much more often")
    }
    ret
}

Analyse.a2 <- function(condition, dat, fixed_objects = NULL) {
    whc <- sample(c(0, 1, 2, 3), 1, prob = c(.7, .20, .05, .05))
    if (whc == 0) {
        ret <- median(dat)
    } else if (whc == 1) {
        ret <- t.test() # missing arguments
    } else if (whc == 2) {
        ret <- t.test("invalid") # invalid arguments
    } else if (whc == 3) {
        # throw error manually
        stop("Manual error thrown")
    }
    ret
}

summarise <- function(condition, results, fixed_objects = NULL) {
    ret <- c(bias = bias(results, 0))
    ret
}

result <- runSimulation(Design,
                        replications = 1000, 
                        generate = generate, 
                        analyse = list(a1 = Analyse.a1, a2 = Analyse.a2), 
                        summarise = summarise)
result

# A tibble: 3 x 9
      N  bias.a1   bias.a2 REPLICATIONS SIM_TIME COMPLETED                      SEED ERRORS WARNINGS
  <dbl>    <dbl>     <dbl>        <int>    <dbl> <chr>                         <int>  <int>    <int>
1    10  0.00184  0.00421          1000     2.73 Fri Nov 12 12:02:43 2021 1842658107   1092      603
2    20 -0.00720 -0.00301          1000     2.80 Fri Nov 12 12:02:46 2021  621640645   1015      618
3    30  0.00649  0.000657         1000     2.89 Fri Nov 12 12:02:48 2021  163788864    997      618

SimExtract(result, 'errors')

   N
1 10
2 20
3 30
  ERROR:  2 INDEPENDENT ERRORS THROWN:   a1.ERROR:   Error in t.test.default("invalid") : not enough 'x' observations\n  a2.ERROR:   Error in t.test.default("invalid") : not enough 'x' observations\n\n
1                                                                                                                                                                                                       4
2                                                                                                                                                                                                       2
3                                                                                                                                                                                                       6

<CLIP...>

Much like your previous strategy, the errors/warnings are combined into a unified error and counted (where applicable). This makes the error list a little less readable in its naming convention, but the problem is more complex, so I suppose that's the consequence. Of course, this doesn't quite solve your situation 100%, but design-wise I think this behaviour is more appropriate than the previous stop-on-first-error setup. Let me know what you think before I close this issue off.

marklhc commented 2 years ago

Hi Phil,

Thank you so much; this is very helpful for my situation. I can do some quick post-processing to compute the error rates using, for instance, "a1.ERROR" and "a2.ERROR" in this example, like below:

err <- SimExtract(result, 'errors')
# Total errors for each method
cbind(
    err[1],
    a1 = rowSums(err[grep("a1.ERROR", names(err))]),
    a2 = rowSums(err[grep("a2.ERROR", names(err))])
)
#>    N   a1  a2
#> 1 10 1374 303
#> 2 20 1272 270
#> 3 30 1271 289

Just to confirm: is the default behaviour now to stop as soon as an error occurs when Analyse is a single function, but to capture the errors from all functions when Analyse is a list of functions?

Thanks again! The package is great. To your earlier point, I've tried different ways to streamline simulation code, and this is by far the best I've used.

Mark

philchalmers commented 2 years ago

Just to confirm: is the default behaviour now to stop as soon as an error occurs when Analyse is a single function, but to capture the errors from all functions when Analyse is a list of functions?

The new default is that when Analyse is a list of functions it will execute and collect all independent error messages from each function execution. Of course, if 1 or more errors are observed then a new dataset will be generated until R = replications valid datasets are successfully analysed by each function in the Analyse list. Does that help? It sounds like we're saying the same thing.
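For completeness, a quick sketch of opting back into the old stop-on-first-error behaviour via the extra_options flag mentioned above, using the objects from the example earlier in this thread:

# Sketch: restore the previous stop-on-first-error behaviour for a list of
# analyse functions
result_old <- runSimulation(Design,
                            replications = 1000,
                            generate = generate,
                            analyse = list(a1 = Analyse.a1, a2 = Analyse.a2),
                            summarise = summarise,
                            extra_options = list(try_all_analyse = FALSE))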

marklhc commented 2 years ago

Yes it does. Thank you Phil!
