When using forked processes for parallel computing, as with parallel::mclapply, I'm not sure it's necessary to check whether the globals are too large, since fork's copy-on-write mechanism is designed exactly for this case:
```r
library(future)
x <- rnorm(100000000)
plan(multicore, workers = 4, earlySignal = TRUE)
system.time(res <- future.apply::future_lapply(1:50, function(i) {
  sum(x) * i
}))
```
```
Error in getGlobalsAndPackages(expr, envir = envir, globals = globals) :
  The total size of the 2 globals that need to be exported for the future
  expression ('FUN()') is 762.94 MiB. This exceeds the maximum allowed size
  of 500.00 MiB (option 'future.globals.maxSize'). There are two globals:
  'x' (762.94 MiB of class 'numeric') and 'FUN' (4.72 KiB of class 'function').
Backtrace:
 1: stop(msg)
 2: getGlobalsAndPackages(expr, envir = envir, globals = globals)
 3: getGlobalsAndPackagesXApply(FUN = FUN, args = args, MoreArgs = MoreArgs,
 4: future_xapply(FUN = FUN, nX = nX, chunk_args = X, args = list(...),
 5: future.apply::future_lapply(1:50, function(i) {
 6: system.time(res <- future.apply::future_lapply(1:50, function(i) {
Timing stopped at: 0.031 0 0.031
```
In this case, I have to explicitly disable detection of globals to make it work:
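For example, a sketch of the two workarounds I mean (assuming the documented `future.globals.maxSize` option and the `future.globals` argument of `future_lapply()`; a smaller vector is used here for brevity):

```r
library(future)
library(future.apply)

plan(multicore, workers = 2)

# Workaround 1: raise (or remove) the size limit on exported globals.
options(future.globals.maxSize = +Inf)

# Workaround 2: skip globals detection entirely. With forked workers the
# children already see `x` via copy-on-write, so nothing needs exporting.
x <- rnorm(1e6)
res <- future_lapply(1:5, function(i) {
  sum(x) * i
}, future.globals = FALSE)
```

Either way the size check no longer aborts the call, but the second form also skips the cost of scanning for globals.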
BTW, I also notice that future.apply::future_lapply is significantly slower than parallel::mclapply in this simple case with the same number of workers.
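A minimal way to reproduce the comparison (a sketch; the vector size is reduced and actual timings will vary by machine and OS):

```r
library(parallel)
library(future.apply)  # also attaches future

x <- rnorm(1e6)
plan(multicore, workers = 4)

# Same task on the same number of workers, timed side by side.
t_mc  <- system.time(mclapply(1:50, function(i) sum(x) * i, mc.cores = 4))
t_fut <- system.time(future_lapply(1:50, function(i) sum(x) * i,
                                   future.globals = FALSE))
print(rbind(mclapply = t_mc, future_lapply = t_fut))
```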
Not sure if I'm missing something?