Closed renkun-ken closed 5 years ago
Thank you for reporting. Here is a smaller illustration of what I think you're reporting, which makes it clear that `a` should really be a global variable:
library(future)
plan(cluster, workers = "localhost")
a <- 1
y1 <- future.apply::future_lapply(1, function(i) {
  if (TRUE) a <- a + 1
  a
})
y2 <- future.apply::future_lapply(1, function(i) {
  if (FALSE) a <- a + 1
  a
})
Using `options(future.debug = TRUE)`, we can see that `a` is identified as a global in the `y1` case whereas it is not in the `y2` case.
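The same question can also be put to the globals package directly, without running any futures. The snippet below is a minimal sketch, assuming the globals package is installed; the names `expr`, `found_ordered` and `found_conservative` are made up for illustration. Whether `a` is reported depends on the installed globals version and on the search method, so no expected output is shown.

```r
library(globals)

a <- 1

# Same body as the y2 case: `a` is read, but it is also assigned
# under a condition that is never true.
expr <- quote({
  if (FALSE) a <- a + 1
  a
})

# Compare two of the search strategies offered by findGlobals().
# Assumption: "ordered" is the strategy future uses by default.
found_ordered <- findGlobals(expr, method = "ordered")
found_conservative <- findGlobals(expr, method = "conservative")

# Is `a` among the identified globals?
print("a" %in% found_ordered)
print("a" %in% found_conservative)
```

If `a` is missing from the result, it would not be exported to a worker.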
And, yes, in contrast, using bare-bone futures, we see that `a` is identified as a global in both cases:
f1 <- future({
  if (TRUE) a <- a + 1
  a
})
v1 <- value(f1)
f2 <- future({
  if (FALSE) a <- a + 1
  a
})
v2 <- value(f2)
Now, what's odd is that:
y3 <- future.apply::future_lapply(1, function(i) {
  b <- a
  a <- a + 1
  TRUE
})
fails to identify `a` as a global, whereas:
y4 <- future.apply::future_lapply(1, function(i) {
  b <- a
  TRUE
})
works.
It looks related to using `a <- a + 1`, where the RHS is a global variable whereas the LHS is a local variable(*). I'll investigate. (There's something way back in my head that this is on a to-do list from before, but I don't trust my memory anymore.) I'll flag it as a bug for now.
(*) It's highly recommended not to use such ambiguous constructs in parallel processing. This is related to the "reset" example shown in https://cran.r-project.org/web/packages/future/vignettes/future-4-issues.html.
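One way to avoid the ambiguity is to never assign to a name that is also read as a global inside the function. The following is a minimal sketch, assuming future.apply is installed; `a_local` is a made-up name, and `plan(sequential)` is used only so the snippet runs without setting up a cluster.

```r
library(future.apply)
plan(sequential)  # assumption: chosen only to keep the sketch self-contained

a <- 1

y <- future_lapply(1:2, function(i) {
  # Read the global `a`, but write the result to a different local name,
  # so that `a` is only ever read inside the function.
  a_local <- a + i
  a_local
})

str(y)
```

Because `a` is never assigned inside the function, the global search does not have to decide whether it is a local or a global variable.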
Thanks for pointing to the vignettes. I'll avoid such usage at the moment.
UPDATE: This has been fixed in the develop version of the globals package.
I've now also added package tests for future.apply that will test for this when globals (> 0.12.4) is released. I'm closing since there's nothing else to do in the future.apply package.
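To see whether an installation already has a globals release newer than 0.12.4, the installed version can be compared directly. This is a small sketch, assuming globals is installed; `installed` and `has_fix` are made-up names.

```r
# Version of the globals package currently installed
installed <- utils::packageVersion("globals")

# TRUE if the installed version is newer than 0.12.4
has_fix <- installed > "0.12.4"

if (has_fix) {
  message("globals ", installed, " is newer than 0.12.4")
} else {
  message("globals ", installed, " is 0.12.4 or older")
}
```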
Thanks! I'll test it soon.
A note for future reference: I've been getting the same error
Error in ...future.FUN(...future.X_jj, ...) : object 'result' not found
occurring under a `future_lapply` call to a function that internally creates (assigns) and then returns an object called `result`.
The problem can be fixed by changing the object name inside the function to something other than `result`.
I can't create a simple reproducible example, but I'm guessing the problem might be occurring because of a conflict with future's use of `result`.
Thxs @geryan. Reproducible examples are always useful, so please share. BTW, does
remotes::install_github("HenrikBengtsson/globals@develop")
fix your problem?
Consider the following cases:
Cases 2-4 end with the following error:
It seems that if a global variable is assigned again under a `FALSE` or non-determined condition, the global variable will not be exported to the worker. This behavior is inconsistent with how `future` itself determines which global variables to export to workers, which produces the correct results without such an error.