Closed dwachsmuth closed 4 years ago
Hi. What's your sessionInfo(), and are you running this in the terminal or some GUI such as RStudio?
Session info is pasted below. I'm running RStudio 1.3.959.
> sessionInfo()
R version 4.0.0 (2020-04-24)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.5
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
locale:
[1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] foreach_1.5.0 future.apply_1.5.0 future_1.17.0 devtools_2.3.0 usethis_1.6.1
loaded via a namespace (and not attached):
[1] Rcpp_1.0.4.6 compiler_4.0.0 iterators_1.0.12 prettyunits_1.1.1 remotes_2.1.1 RPostgres_1.2.0
[7] tools_4.0.0 testthat_2.3.2 digest_0.6.25 pkgbuild_1.0.8 pkgload_1.1.0 bit_1.1-15.2
[13] memoise_1.1.0 pkgconfig_2.0.3 rlang_0.4.6 DBI_1.1.0 cli_2.0.2 rstudioapi_0.11
[19] parallel_4.0.0 xfun_0.14 withr_2.2.0 desc_1.2.0 fs_1.4.1 vctrs_0.3.0
[25] globals_0.12.5 hms_0.5.3 rprojroot_1.3-2 bit64_0.9-7 doFuture_0.9.0 glue_1.4.1
[31] listenv_0.8.0 R6_2.4.1 processx_3.4.2 fansi_0.4.1 sessioninfo_1.1.1 callr_3.4.3
[37] blob_1.2.1 magrittr_1.5 codetools_0.2-16 backports_1.1.7 ps_1.3.3 ellipsis_0.3.1
[43] assertthat_0.2.1 tinytex_0.23 doParallel_1.0.15 crayon_1.3.4
Since you're using multicore while running RStudio, you must have explicitly re-enabled multicore processing. See ?future::supportsMulticore
for details on why it's disabled by default. There's also a link there to the RStudio folks saying that forked processing should be avoided when using RStudio.
I recommend that you try your code in a plain R terminal session without RStudio and see if that makes a difference. If it still crashes, it could also be that you're running out of memory; cor(x, y)
can be quite memory hungry. When you run out of memory and forked child processes die, you can get the kind of error you're mentioning.
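For scale: cor(x, y) returns an ncol(x) × ncol(y) matrix of doubles, so the failing case from the reprex needs about 2 GiB for the result alone. A quick back-of-envelope check (the 10000 and 26844 column counts are taken from the reprex):

```r
# Back-of-envelope: cor(x, y) returns an ncol(x) x ncol(y) matrix of
# doubles (8 bytes each), before counting any intermediate copies.
nx <- 10000   # columns of x_list[[1]] in the reprex
ny <- 26844   # columns of y_list_big[[1]], the failing case
bytes <- nx * ny * 8
bytes / 1024^3  # about 2 GiB per worker, for the result alone
```

That matches the ~2 GB per thread figure reported from profmem later in this thread.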
Hi Henrik,
The problem is exactly the same in the terminal. The "small" correlation completes every time, the incrementally larger one fails every time. (I'm aware of the potential issues with multicore processing in RStudio, but these kind of "dumb but large scale" operations have always been very stable for me, and the overhead of copying gigantic globals into PSOCK clusters almost completely cancels out the parallelization benefits.)
I'm also quite confident it's not a simple memory issue, since I'm doing this work on a computer with 384 GB of RAM, and I can run 32 threads of the small correlation from my reprex without any issues. profmem
suggests that the correlation allocates about 2 GB of memory for a single thread. (Output pasted below.) Monitoring RAM at the system level shows that R is using 100 GB of RAM at the peak when I've got 32 threads going simultaneously, so ~ 3GB per thread.
profmem::profmem({output <- cor(x_list[[1]], y_list_big[[1]])})
Rprofmem memory profiling of:
{
output <- cor(x_list[[1]], y_list_big[[1]])
}
Memory allocations:
Number of 'new page' entries not displayed: 3
what bytes calls
4 alloc 2147520048 cor()
5 alloc 80048 cor()
6 alloc 214800 cor()
7 alloc 40048 cor()
8 alloc 107424 cor()
total 2147962368
I should add that I ran into this problem in the course of package development, where the use case will frequently involve these very large matrix correlations, and I want users to be able to use {future} to speed things up.
So just knowing that the code works with plan(multisession)
isn't very helpful, since I guess I would have to detect a multicore future within the package and then conditionally disable multithreaded processing in that case, which would be a very poor user experience, given that the rest of the package works fine (and is indeed far faster) with multicore futures.
Another step towards narrowing down the source of the problem: does it also crash if you call parallel::mcmapply()? That should be the closest to what your code runs.
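A minimal sketch of that parallel::mcmapply() check, using small stand-in matrices (the real objects from the reprex are far larger), with mc.cores falling back to 1 on platforms without fork:

```r
library(parallel)

# Small stand-ins for x_list / y_list_big from the reprex.
x_list <- list(matrix(rnorm(10 * 100), nrow = 10),
               matrix(rnorm(10 * 100), nrow = 10))
y_list <- list(matrix(rnorm(10 * 200), nrow = 10),
               matrix(rnorm(10 * 200), nrow = 10))

# Forked equivalent of future.apply::future_mapply(cor, ...):
# one forked child per list element.
cores <- if (.Platform$OS.type == "unix") 2L else 1L
result <- mcmapply(cor, x_list, y_list, SIMPLIFY = FALSE, mc.cores = cores)
```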
Also, try setting options(future.fork.multithreading.enable = FALSE)
. This should disable multi-threaded processing in your forked processes. Multi-threading combined with forked processing is known to cause issues in R.
FYI, I don't think this problem is related to the future framework per se.
EDIT 2021-07-07: Fix typo; options()
and not option()
Ok, parallel::mcmapply()
produced the same error, which means you're right and the problem isn't related to {future}. It's still problematic for my package, though, because I can't assume what kind of plan users will be setting before running the function.
But probably the solution will just be to test for matrices above a certain size and split them preemptively.
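One possible shape for that preemptive split (a sketch only; cor_chunked and the chunk size are invented for illustration): since cor(x, y) is computed column-pair by column-pair, y can be processed in column blocks and the pieces cbind-ed back together:

```r
# Hypothetical helper: compute cor(x, y) over column chunks of y, so each
# worker's result stays below a chosen size threshold.
cor_chunked <- function(x, y, chunk_cols = 10000L) {
  starts <- seq(1L, ncol(y), by = chunk_cols)
  pieces <- lapply(starts, function(s) {
    e <- min(s + chunk_cols - 1L, ncol(y))
    cor(x, y[, s:e, drop = FALSE])
  })
  do.call(cbind, pieces)  # reassemble the full ncol(x) x ncol(y) result
}

# Small check that chunking reproduces cor() exactly.
x <- matrix(rnorm(10 * 50), nrow = 10)
y <- matrix(rnorm(10 * 120), nrow = 10)
all.equal(cor_chunked(x, y, chunk_cols = 25L), cor(x, y))  # TRUE
```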
Thanks for looking into this!
(Incidentally, I wasn't able to get the future.fork.multithreading.enable
option to work. I received an error saying disabling multithreading wasn't possible on my system.)
The future.fork.multithreading.enable
option is a beta feature that I've been trying to introduce in as robust a way as possible. All it ends up doing internally is trying to force single-threaded processing by calling:
RhpcBLASctl::omp_set_num_threads(1L)
You could try to call that in your mcmapply()
function too.
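A hedged sketch of that workaround: wrap cor() so each forked child first caps its own thread counts, assuming the RhpcBLASctl package is installed (cor_single_threaded is an invented name for illustration):

```r
# Invented wrapper: cap OpenMP (and BLAS) threads inside each forked child
# before doing the heavy work, assuming RhpcBLASctl is available.
cor_single_threaded <- function(x, y) {
  if (requireNamespace("RhpcBLASctl", quietly = TRUE)) {
    RhpcBLASctl::omp_set_num_threads(1L)
    RhpcBLASctl::blas_set_num_threads(1L)  # also cap BLAS threads
  }
  cor(x, y)
}

# Drop-in replacement in the mcmapply() call from the thread:
# result <- parallel::mcmapply(cor_single_threaded, x_list, y_list_big,
#                              SIMPLIFY = FALSE)
r <- cor_single_threaded(matrix(rnorm(40), 10), matrix(rnorm(60), 10))
```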
However, there are probably better and more robust ways to disable multi-threaded processing in R, e.g. setting
export OMP_NUM_THREADS=1
before launching R. See https://github.com/HenrikBengtsson/future/issues/255 for other env vars that you could also set to 1
. After doing this, see if mcmapply()
still fails.
The real problem here is that you have something that is unstable, and my best guess is that it's due to using forked processing and multi-threading at the same time. That is never a good combination. Even if you can find a workaround on your local system that seems to avoid triggering problems, you will never know whether that holds for others. There are so many things that could be going on here, and without understanding the real cause, I would refrain from ad-hoc workarounds. They will come back and bite you or an end-user!
I'm leaning more and more to tell all developers and users to not use forked processing in R. Here is what the author of mclapply()
wrote in R-devel thread 'mclapply returns NULLs on MacOS when running GAM' (https://stat.ethz.ch/pipermail/r-devel/2020-April/079384.html) on 2020-04-28:
Do NOT use mcparallel() in packages except as a non-default option that user can set for the reasons Henrik explained. Multicore is intended for HPC applications that need to use many cores for computing-heavy jobs, but it does not play well with RStudio and more importantly you don't know the resource available so only the user can tell you when it's safe to use. Multi-core machines are often shared so using all detected cores is a very bad idea. The user should be able to explicitly enable it, but it should not be enabled by default.
I'm closing but feel free to comment further.
Many thanks for the additional information/feedback!
I want to be clear, though, that the code in my package simply calls future.apply::future_mapply()
, and it is through the process of using the in-development package for my lab's research that I discovered the issue with large matrices and multicore futures. And in fact the intention for our internal use is specifically to run our code on HPCs where multicore processing is fairly common.
In other words, I think I am following the recommended design pattern for using {future}: write the package with no assumptions about the type of plan a user will use. But it looks like the code will break if a user sets plan(multicore)
(which I have no control over) and happens to supply matrices of more than a certain size.
So I could leave the code as is, and maybe include a warning in the package docs that there's a known problem with multicore parallelism, or I could try to detect the (potentially rare?) combination of very large inputs and plan(multicore) and either fail more informatively or chunk the job into a larger number of smaller matrices.
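If detection turns out to be the way to go, a hypothetical guard might just inspect the class of the current plan; calling future::plan() with no arguments returns the active strategy (uses_forking is an invented name, not package code):

```r
# Hypothetical guard: check whether the user's current {future} plan uses
# forking, so the package can chunk inputs or fail with a clearer message.
uses_forking <- function() {
  if (!requireNamespace("future", quietly = TRUE)) return(FALSE)
  inherits(future::plan(), "multicore")
}

uses_forking()  # FALSE under the default sequential plan
```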
Also, a quick update. Setting the OMP threads to 1 didn't change the problem: both future.apply::future_mapply
and parallel::mcmapply
return an error that one or more cores did not deliver results.
I only see now that your session info mentions BLAS, so then retry with
export OPENBLAS_NUM_THREADS=1
And obviously, make sure to troubleshoot in a fresh R session in the terminal; R --vanilla
Also, try setting
option(future.fork.multithreading.enable = FALSE)
. This will should disable multi-thread processing in your forked processes. Multi-threading and forked processing is also known to causes issues in R.
This solves quite a few cases and is very useful, but one tiny remark: it is options()
, not option()
I need to compute correlations between very large matrices, and am trying to parallelize the task using {future}. I have discovered that there is a certain matrix size which will reliably produce a "Failed to retrieve the result of MulticoreFuture" error when I use plan(multicore). I.e. 100% failure rate. The same exact task succeeds 100% of the time with plan(sequential) or plan(multisession).
I am using {future.apply}, but I've verified that the same issue is present with {doFuture} and {foreach}, which suggested to me that the issue might be with {future} itself.
Because the object sizes have to be quite large, the reprex is a little annoying, but here it is.
library(future)
library(foreach)
plan(multicore)
# Create list of big matrices
x_list <- list(
matrix(data = rnorm(10 * 10000), nrow = 10),
matrix(data = rnorm(10 * 10000), nrow = 10)
)
# Create other list of big matrices
y_list_small <- list(
matrix(data = rnorm(10 * 26843), nrow = 10),
matrix(data = rnorm(10 * 26843), nrow = 10)
)
# Works
result <- future.apply::future_mapply(cor, x_list, y_list_small, SIMPLIFY = FALSE)
# Create list of slightly bigger matrices
y_list_big <- list(
matrix(data = rnorm(10 * 26844), nrow = 10),
matrix(data = rnorm(10 * 26844), nrow = 10)
)
# Does not work
result <- future.apply::future_mapply(cor, x_list, y_list_big, SIMPLIFY = FALSE)
# Increasing the vector size doesn't change results
x_list_long <- list(
matrix(data = rnorm(50 * 10000), nrow = 50),
matrix(data = rnorm(50 * 10000), nrow = 50)
)
y_list_small_but_long <- list(
matrix(data = rnorm(50 * 26843), nrow = 50),
matrix(data = rnorm(50 * 26843), nrow = 50)
)
# Works
result <- future.apply::future_mapply(cor, x_list_long, y_list_small_but_long, SIMPLIFY = FALSE)
# Same problem in doFuture
doFuture::registerDoFuture()
# (Assignments made inside %dopar% don't propagate back from the workers,
# so capture foreach's returned list directly.)
result <- foreach(i = 1:2) %dopar% cor(x_list[[i]], y_list_big[[i]])