Nested Futures

wfmueller29 commented 3 years ago

Hi Henrik,

Thank you for the awesome package. It is such an elegant and powerful way to parallelize. I have a question regarding nested parallelization. I am developing some functions which can be simplified to fun_a below.

library(future)
library(listenv)
fun_a <- function(x){
  a <- listenv()
  for(i in 1:x){
    a[[i]] <- future(Sys.sleep(5))
  }
  a <- as.list(a)
}

I can then call the function sequentially while the function itself is processed asynchronously without any problems:

ncpus <- availableCores()
cl <- makeClusterPSOCK(ncpus)
plan(cluster, workers = cl)
b <- fun_a(10)
c <- fun_a(15)
e <- fun_a(13)

However, it would be great if I could also run the calls for b, c, and e in parallel. I am trying to do this as shown below:

plan(list(tweak(cluster, workers = 3), tweak(cluster, workers = (length(cl) - 3) %/% 3)))

b %<-% fun_a(10)
c %<-% fun_a(15)
e %<-% fun_a(13)

done <- as.list(value(b),

However, I get the error described in your Common Issues vignette:

Error: Invalid usage of futures: A future (here ‘ClusterFuture’) whose value has not yet been collected can only be queried 
by the R process (d3bd14ca-0e69-51f1-57df-6ff5ebd64ed4; pid 50037 on localhost) that created it, not by any other R 
processes (b8240ee6-1935-40ed-2b69-d177edf0b624; pid 48446 on localhost): Sys.sleep(5)

HenrikBengtsson commented 3 years ago

You can't pass futures between processes, so you can't create (future()) them in a parallel worker and ask them to be resolved (value()) in another. Instead, use:

fun_a <- function(x){
  a <- list()
  for(i in 1:x){
    a[[i]] <- future(Sys.sleep(5))
  }
  ## Collect the values in the same process that created the futures
  value(a)
}

or, equivalently, using the implicit future-assignment syntax:

fun_a <- function(x){
  a <- listenv()
  for(i in 1:x){
    a[[i]] %<-% Sys.sleep(5)
  }
  ## as.list() waits for the futures and returns their values
  as.list(a)
}

See also https://future.futureverse.org/articles/future-3-topologies.html
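
For anyone landing here later, a minimal end-to-end sketch of the pattern above combined with a two-level plan might look like the following. It is not taken from the thread. The backend (multisession instead of the cluster object in the question), the worker counts, and the i^2 payload replacing Sys.sleep(5) are assumptions chosen so the result is easy to check. The I() wrapper on the inner worker count is assumed to be needed on recent versions of future/parallelly, which otherwise limit workers inside a worker.

```r
library(future)

## Futures are created AND resolved inside fun_a(), i.e. in the same R
## process, so only plain values are passed back to the caller.
fun_a <- function(x) {
  fs <- lapply(seq_len(x), function(i) future(i^2))
  value(fs)  # value() on a list of futures returns a list of values
}

## Two-level topology (worker counts are illustrative assumptions):
## the outer layer runs the three fun_a() calls, the inner layer runs
## the futures created inside each call.
plan(list(
  tweak(multisession, workers = 3),
  tweak(multisession, workers = I(2))
))

b %<-% fun_a(3)
c %<-% fun_a(4)
e %<-% fun_a(2)

## Accessing b, c, and e blocks until each outer future is resolved
done <- list(b = b, c = c, e = e)

plan(sequential)  # shut down the workers
```

The point of the design is that each layer only ever calls value() on futures it created itself, which is what the error message in the question asks for.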