Merck / simtrial

Clinical trial simulation for time-to-event endpoints
https://merck.github.io/simtrial/
GNU General Public License v3.0
17 stars 8 forks source link

A potential bug of parallel computation in `sim_gs_n` #260

Closed LittleBeannie closed 3 months ago

LittleBeannie commented 3 months ago

The following lines of codes gives me an error message when I run it parallelly. However, the sequential processing is error-free.

ratio <- 1

enroll_rate <- gsDesign2::define_enroll_rate(duration = c(2, 2, 8),
                                             rate = c(1, 2, 3))

fail_rate <- gsDesign2::define_fail_rate(duration = c(4, Inf), 
                                         fail_rate = log(2) / 12,
                                         hr = c(1, .6),
                                         dropout_rate = .001)

alpha <- 0.025
beta <- 0.1

upper <- gsDesign2::gs_spending_bound
upar <- list(sf = gsDesign::sfLDOF, total_spend = alpha)
test_upper <- rep(TRUE, 2)

lower <- gsDesign2::gs_spending_bound
lpar <- list(sf = gsDesign::sfLDOF, total_spend = beta)
test_lower <- c(TRUE, FALSE)
binding <- FALSE

info_frac = NULL
analysis_time = c(24, 36)

x <- gsDesign2::gs_design_ahr(enroll_rate = enroll_rate, fail_rate = fail_rate, 
                              alpha = alpha, beta = beta, ratio = ratio,
                              info_frac = info_frac, analysis_time = analysis_time, 
                              upper = upper, upar = upar, test_upper = test_upper,
                              lower = lower, lpar = lpar, test_lower = test_lower,
                              binding = binding) |> gsDesign2::to_integer()

ia_cut <- simtrial::create_cut(planned_calendar_time = x$analysis$time[1])
fa_cut <- simtrial::create_cut(planned_calendar_time = x$analysis$time[2])

future::plan("multisession", workers = 8)
simtrial::sim_gs_n(
  n_sim = 1e2,
  sample_size = x$analysis$n[2],
  enroll_rate = x$enroll_rate,
  fail_rate = x$fail_rate,
  test = simtrial::wlr,
  cut = list(ia = ia_cut, fa = fa_cut),
  weight = simtrial::fh(rho = 0, gamma = 0))
future::plan("sequential")

Following is the error message.

Using 8 cores with backend multisession
Error in eval(expr, p) : object 'x' not found
In addition: There were 50 or more warnings (use warnings() to see the first 50)
> warnings()
Warning messages:
1: In input_check_scalar(planned_calendar_time) :
  restarting interrupted promise evaluation

Hello @jdblischak and @cmansch , as both of you are the developers of sim_gs_n, I am including both of you in this discussion.

jdblischak commented 3 months ago

The problem is related to passing the arguments to create_cut() as variables:

ia_cut <- simtrial::create_cut(planned_calendar_time = x$analysis$time[1])
fa_cut <- simtrial::create_cut(planned_calendar_time = x$analysis$time[2])

When I replace them with their actual value, it works:

ia_cut <- simtrial::create_cut(planned_calendar_time = 23.82414)
fa_cut <- simtrial::create_cut(planned_calendar_time = 35.85303)

I believe this has to do with the complex feature of create_cut(), in that it is delaying evaluation of its arguments until the function is creates is executed. My initial idea was to use force(), but that didn't work. I'll keep investigating.

ia_cut <- simtrial::create_cut(planned_calendar_time = force(x$analysis$time[1]))
fa_cut <- simtrial::create_cut(planned_calendar_time = force(x$analysis$time[2]))
cmansch commented 3 months ago

I believe this is due to local/global issues with how foreach and doFuture work together. When foreach uses a parallel evaluation, global variables are not always accessible due to the instances that doFuture creates. This issue can be avoided if we explicitly pass the cut function to each instance of the loop. We currently use this for the test variable as well.

jdblischak commented 3 months ago

I was using force() incorrectly above. It needs to be called inside the function (source: Advanced R v1 chp 6 on functions). Updating create_cut() to force its arguments appears to fix the problem, but I'm still testing:

create_cut <- function(...) {
  lapply(X = list(...), FUN = force)
  function(data) {
    get_analysis_date(data, ...)
  }
}
cmansch commented 3 months ago

I have tested the updated create_cut() and am in alignment with this change as fixing the issue.

nanxstats commented 3 months ago

@jdblischak I did a functional test by running Yujie's example code above using the current simtrial version in the main branch and saw the same error reported by macOS runners in #262 revealed by tests.

remotes::install_version("gsDesign", "3.6.3")
remotes::install_version("gsDesign2", "1.1.2")
remotes::install_github("Merck/simtrial@c037f8f")
> future::plan("multisession", workers = 8)
> simtrial::sim_gs_n(
+   n_sim = 1e2,
+   sample_size = x$analysis$n[2],
+   enroll_rate = x$enroll_rate,
+   fail_rate = x$fail_rate,
+   test = simtrial::wlr,
+   cut = list(ia = ia_cut, fa = fa_cut),
+   weight = simtrial::fh(rho = 0, gamma = 0)
+ )
Using 8 cores with backend multisession
Error in `[.data.table`(x, , `:=`(enroll_time, rpwexp_enroll(n, enroll_rate))) : 
  Supplied 454 items to be assigned to 453 items of column 'enroll_time'. If you wish to 'recycle' the RHS please use rep() to make this intent clear to readers of your code.
jdblischak commented 3 months ago

I did a functional test by running Yujie's example code above using the current simtrial version in the main branch and saw the same error reported by macOS runners in https://github.com/Merck/simtrial/pull/262 revealed by tests.

Thanks @nanxstats for confirming! While it's frustrating that this is still broken for macOS users, the good news is that the test I added in #262 appears to be identifying an actual problem that will affect end users.

As an experiment, could you please re-run your test above but install {simtrial} from 23449e00e53af89330ec63d4b3b807fe41129311, ie prior to #261? I'm curious if you'll receive the {data.table} error or the object 'x' not found error.

nanxstats commented 3 months ago

@jdblischak Yep, the same data.table error already exists under https://github.com/Merck/simtrial/commit/23449e00e53af89330ec63d4b3b807fe41129311, under macOS, for both sequential and parallel:

remotes::install_github("Merck/simtrial@23449e0")
Using 8 cores with backend multisession
Error in `[.data.table`(x, , `:=`(enroll_time, rpwexp_enroll(n, enroll_rate))) :
  Supplied 454 items to be assigned to 453 items of column 'enroll_time'. If you wish to 'recycle' the RHS please use rep() to make this intent clear to readers of your code.
In addition: There were 33 warnings (use warnings() to see them)
Backend uses sequential processing.
Loading required package: foreach
Loading required package: future
Error in `[.data.table`(x, , `:=`(enroll_time, rpwexp_enroll(n, enroll_rate))) :
  Supplied 454 items to be assigned to 453 items of column 'enroll_time'. If you wish to 'recycle' the RHS please use rep() to make this intent clear to readers of your code.
nanxstats commented 3 months ago

It looks like different issues were triggered when the random seed is different. In the beginning of the example, use a set.seed(42) gives the object 'x' not found error. Use a set.seed(44) gives the data.table error.

set.seed(42)

Using 8 cores with backend multisession
Error in eval(expr, p) : object 'x' not found
In addition: There were 36 warnings (use warnings() to see them)

set.seed(44)

Using 8 cores with backend multisession
Error in `[.data.table`(x, , `:=`(enroll_time, rpwexp_enroll(n, enroll_rate))) :
  Supplied 454 items to be assigned to 453 items of column 'enroll_time'. If you wish to 'recycle' the RHS please use rep() to make this intent clear to readers of your code.
In addition: There were 41 warnings (use warnings() to see them)
jdblischak commented 3 months ago

Yep, the same data.table error already exists under https://github.com/Merck/simtrial/commit/23449e00e53af89330ec63d4b3b807fe41129311, under macOS, for both sequential and parallel:

Thanks for confirming. That is a useful clue. This {data.table} error was not introduced by #261, but instead has always existed.

It looks like different issues were triggered when the random seed is different. In the beginning of the example, use a set.seed(42) gives the object 'x' not found error. Use a set.seed(44) gives the data.table error.

That's not a good sign for troubleshooting. If there is some sort of race condition between the two errors, this is going to be hard to isolate, fix, and test.