HenrikBengtsson / doFuture

:rocket: R package: doFuture - Use Foreach to Parallelize via Future Framework
https://doFuture.futureverse.org

%dofuture% can't get the correct environment when it was written in a function #82

Open chhtwhc opened 10 months ago

chhtwhc commented 10 months ago

When I use %dofuture% inside a user-defined function, it only finds variables in the global environment. Variables in the function's local environment are ignored.

Here is the example:

myFunction1 = function(x){
  y = x + 1
  return(y)
} 
myFunction2 = function(x, y){
  z = x + y
  return(z)
}
myFunction3 = function(function_var4, function_var5, function_var6){

  # Claim some local variables
  local_var1 = vector("list", length = ncol(function_var4))
  local_var2 = vector("list", length = ncol(function_var4))
  local_var3 = function_var5 %>% pull(function_var6)
  local_var4 = data.frame(Var = seq(min(local_var3), max(local_var3), length.out = 10000))

  # Do some parallel calculation
  plan(multisession, workers = parallel::detectCores() - 2)

  foreach (i = 1:ncol(function_var4)) %dofuture% {

    data_glm = data.frame(Var = local_var3,
                          PreAbs = function_var4[,i])

    mod_glm = glm(PreAbs ~ poly(Var, 3), family = binomial, data = data_glm)

    # Result of the calculation
    local_local_var1 = predict(mod_glm, newdata = local_var4, se = F, type = "response")

    # Some simple calculation using local_local_var1
    # Save the result
    local_var1[[i]] = mean(local_local_var1)  # <---- I guess this causes the error
    local_var2[[i]] = myFunction1(local_local_var1)    
  }

  # Close multisession workers by switching plan
  plan(sequential)

  local_var5 =  myFunction2(local_local_var1, function_var5)

  return(list(opt = unlist(local_var1),
              nw = unlist(local_var2),
              miv = unlist(local_var5)))
}

Running the code below gives the error message "Error in eval(quote({ : object 'local_var1' not found":

library(foreach)
library(doFuture)
library(dplyr)

global_var1 = matrix(sample(c(0, 1), size = 10000, replace = T), ncol = 100) %>%
  as.data.frame()
global_var2 = data.frame(C1 = rnorm(100))
global_var3 = "C1"

result = myFunction3(function_var4 = global_var1, function_var5 = global_var2, function_var6 = global_var3)
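A likely cause, stated here as an assumption rather than a confirmed diagnosis: the loop body runs in separate worker processes, so assignments such as local_var1[[i]] = ... cannot modify the list in the calling function. Because local_var1 is assigned to inside the body, it may be treated as a variable local to the loop and never exported to the workers. The sketch below shows the usual foreach pattern of returning each iteration's result and collecting the results afterwards. The function summarise_columns and its arguments are hypothetical and only illustrate the pattern; they are not the code from this report.

```r
library(foreach)
library(doFuture)

# Hypothetical helper (not from the original post): every iteration
# returns its results instead of assigning into a list defined outside
# the loop.
summarise_columns <- function(m) {
  offset <- 1  # local variable that the loop body only reads

  res <- foreach(i = seq_len(ncol(m))) %dofuture% {
    values <- m[, i]
    # The value of the last expression is what foreach collects
    list(opt = mean(values), nw = max(values) + offset)
  }

  # res is a list with one element per iteration; unpack it here,
  # in the calling function, after the parallel part has finished
  list(opt = vapply(res, function(r) r$opt, numeric(1)),
       nw  = vapply(res, function(r) r$nw, numeric(1)))
}

plan(multisession, workers = 2)
out <- summarise_columns(matrix(1:6, ncol = 3))
plan(sequential)

print(out)
```

In this sketch the local variable offset is only read inside the loop body, so it can be picked up from the function's environment, while the per-iteration results travel back as the return value of the foreach() call.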

I also posted the question on Stack Overflow: https://stackoverflow.com/questions/77772436/r-variable-scoping-issue-of-parallel-calculation-dofuture-in-a-nested-functi