maxim-h closed this issue 11 months ago
Hi,
thanks for this. This looks specific to the future.apply package, for which I can reproduce it. I cannot reproduce this for bare-bone futures, furrr, or doFuture.
library(future)
plan(list(
  outer = tweak(multisession, workers = 2),
  inner = tweak(multisession, workers = 2)
))
options(future.globals.maxSize = 1234000)
f <- future({
  outer <- data.frame(
    label = "outer",
    pid = Sys.getpid(),
    maxSize = getOption("future.globals.maxSize", NA_real_)
  )
  f <- future({
    data.frame(
      label = "inner",
      pid = Sys.getpid(),
      maxSize = getOption("future.globals.maxSize", NA_real_))
  })
  inner <- value(f)
  rbind(outer, inner)
})
v <- value(f)
print(v)
gives
  label    pid maxSize
1 outer 778647 1234000
2 inner 778763 1234000
which shows that the R option is indeed carried down to the inner layer (as it should be, by design):
stopifnot(all(v$maxSize == getOption("future.globals.maxSize")))
library(future.apply)
v <- future_lapply(1:2, FUN = function(x) {
  outer <- data.frame(
    label = "outer",
    idx = x,
    pid = Sys.getpid(),
    maxSize = getOption("future.globals.maxSize", NA_real_)
  )
  inner <- future_lapply(3:4, FUN = function(x) {
    data.frame(
      label = "inner",
      idx = x,
      pid = Sys.getpid(),
      maxSize = getOption("future.globals.maxSize", NA_real_))
  })
  inner <- do.call(rbind, inner)
  rbind(outer, inner)
})
v <- do.call(rbind, v)
print(v)
gives:
  label idx    pid maxSize
1 outer   1 778647      NA
2 inner   3 778763      NA
3 inner   4 778764      NA
4 outer   2 778646      NA
5 inner   3 779485      NA
6 inner   4 779484      NA
library(furrr)
v <- future_map(1:2, function(x) {
  outer <- data.frame(
    label = "outer",
    idx = x,
    pid = Sys.getpid(),
    maxSize = getOption("future.globals.maxSize", NA_real_)
  )
  inner <- future_map(3:4, function(x) {
    data.frame(
      label = "inner",
      idx = x,
      pid = Sys.getpid(),
      maxSize = getOption("future.globals.maxSize", NA_real_))
  })
  inner <- do.call(rbind, inner)
  rbind(outer, inner)
})
v <- do.call(rbind, v)
print(v)
gives
  label idx    pid maxSize
1 outer   1 778647 1234000
2 inner   3 778763 1234000
3 inner   4 778764 1234000
4 outer   2 778646 1234000
5 inner   3 779485 1234000
6 inner   4 779484 1234000
library(doFuture)
v <- foreach(x = 1:2) %dofuture% {
  outer <- data.frame(
    label = "outer",
    idx = x,
    pid = Sys.getpid(),
    maxSize = getOption("future.globals.maxSize", NA_real_)
  )
  inner <- foreach(x = 3:4) %dofuture% {
    data.frame(
      label = "inner",
      idx = x,
      pid = Sys.getpid(),
      maxSize = getOption("future.globals.maxSize", NA_real_))
  }
  inner <- do.call(rbind, inner)
  rbind(outer, inner)
}
v <- do.call(rbind, v)
print(v)
gives
  label idx    pid maxSize
1 outer   1 778647 1234000
2 inner   3 778763 1234000
3 inner   4 778764 1234000
4 outer   2 778646 1234000
5 inner   3 779485 1234000
6 inner   4 779484 1234000
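The reproductions above build the same one-row data frame at every layer. As a rough sketch, that repeated piece could be factored into a small helper; the name `probe` and its arguments are hypothetical and not part of any of the packages discussed here. It uses base R only and reports which process ran it and what value of the option that process sees.

```r
# Hypothetical helper (not part of future, future.apply, furrr, or doFuture):
# returns one row describing the current process and the option it sees.
probe <- function(label, idx = NA_integer_) {
  data.frame(
    label = label,
    idx = idx,
    pid = Sys.getpid(),
    maxSize = getOption("future.globals.maxSize", NA_real_)
  )
}

# Called in the main session, it reports the value set there.
options(future.globals.maxSize = 1234000)
print(probe("main"))
```

Inside a future, a call such as `probe("outer", x)` would report what the worker sees, assuming the helper is picked up as a global in the usual way.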
This has been fixed in the next version of future.apply. It was due to a typo; changing length(chunk) to length(chunks) in a few places fixed it. Doh!
FYI, future.apply 1.11.1, fixing this, is now on CRAN.
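To check whether an installation already includes the fix, something along these lines should work. It is only a sketch: the helper name `has_nested_option_fix` is made up for illustration, and the only fact it relies on is the version number 1.11.1 mentioned above.

```r
# Hypothetical helper: TRUE if a version string is at least 1.11.1,
# the future.apply release reported above as containing the fix.
has_nested_option_fix <- function(version) {
  package_version(version) >= package_version("1.11.1")
}

has_nested_option_fix("1.11.0")  # FALSE
has_nested_option_fix("1.11.1")  # TRUE

# Check the installed copy, if there is one.
if (requireNamespace("future.apply", quietly = TRUE)) {
  installed <- as.character(utils::packageVersion("future.apply"))
  message("future.apply ", installed,
          ": fix included = ", has_nested_option_fix(installed))
}
```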
Thank you! Checked the new CRAN version. All works as expected.
Hi. And thank you for a great package!
The future.globals.maxSize option limits the total size of the global objects that can be moved between processes. Very useful feature. But obviously sometimes you need to increase it, like so:
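A minimal sketch of raising the limit, assuming a target of 2 GiB (the value is only an example). The option is given in bytes; as far as I know the default is 500 MiB, and setting it to `+Inf` disables the check.

```r
# Raise the limit to 2 GiB; the previous setting is kept in `old`
# so that it can be restored later with options(old).
old <- options(future.globals.maxSize = 2 * 1024^3)

getOption("future.globals.maxSize")
```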
This breaks once you define a nested plan. For example
The future.globals.maxSize option seems to apply only to the outer plan, not the inner one. This was reported here. I couldn't find anything in the documentation on how to deal with such a situation. The only workaround I could find is to define the option twice: once globally, and once inside the outer function that is being run. Like this, for example:
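A minimal sketch of such a workaround, assuming two multisession layers with two workers each and a made-up limit of 1234000 bytes; the names `max_size` and `res` are hypothetical. The idea is that the size check on globals appears to happen in the process that creates a future, so the outer worker needs the option set before it launches the inner futures.

```r
library(future)
library(future.apply)

# Two nested layers of multisession workers (assumed setup).
plan(list(
  tweak(multisession, workers = 2),
  tweak(multisession, workers = 2)
))

max_size <- 1234000  # hypothetical limit, in bytes

# First definition: in the main session, for the outer layer.
options(future.globals.maxSize = max_size)

res <- future_lapply(1:2, FUN = function(x) {
  # Second definition: on the outer worker, before inner futures are created.
  options(future.globals.maxSize = max_size)
  inner <- future_lapply(3:4, FUN = function(y) x * y)
  unlist(inner)
})

print(res)

# Return to sequential processing and shut the workers down.
plan(sequential)
```

With future.apply 1.11.1 or later, the second `options()` call should no longer be needed.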
This seems inconvenient and error-prone. Is there a better method of defining these options for nested futures? If not, perhaps there should be.