Closed waynelapierre closed 3 years ago
It shouldn't, since forked processing is unreliable in many GUIs, including RStudio. This is what I get in RStudio 1.4.1717 with R 4.1.0 on Ubuntu 18.04:
> parallelly::supportsMulticore()
[1] FALSE
> parallelly:::supportsMulticoreAndRStudio()
[1] FALSE
> sessionInfo()
R version 4.1.0 Patched (2021-06-26 r80566)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.5 LTS
Matrix products: default
BLAS: /home/hb/software/R-devel/R-4-1-branch/lib/R/lib/libRblas.so
LAPACK: /home/hb/software/R-devel/R-4-1-branch/lib/R/lib/libRlapack.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
[6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] compiler_4.1.0 parallelly_1.26.1 startup_0.15.0 parallel_4.1.0 tools_4.1.0
Here's how to check whether you are actually running on parallel workers. You should get one unique PID per worker. I only get one in RStudio:
> library(future.apply)
> plan(sequential)
> future_sapply(1:nbrOfWorkers(), function(i) c(i = i, pid = Sys.getpid()))
[,1]
i 1
pid 9588
> plan(multicore)
> future_sapply(1:nbrOfWorkers(), function(i) c(i = i, pid = Sys.getpid()))
[,1]
i 1
pid 9588
> plan(multicore, workers = 3)
>
> future_sapply(1:nbrOfWorkers(), function(i) c(i = i, pid = Sys.getpid()))
[,1] [,2] [,3]
i 1 2 3
pid 9588 9588 9588
Warning message:
In supportsMulticoreAndRStudio(...) :
[ONE-TIME WARNING] Forked processing ('multicore') is not supported when running R from RStudio because it is considered unstable. For more details, how to control forked processing or not, and how to silence this warning in future R sessions, see ?parallelly::supportsMulticore
As you can see, everything runs in the same process (same PID), just as with plan(sequential).
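The PID check above can be wrapped in a small helper. The following is only a sketch: `count_worker_pids()` is a hypothetical name, and the fallback to `plan(multisession)` (background R sessions instead of forks) is not discussed in this thread; it is a different backend from the future package, used here for illustration.

```r
library(future.apply)  # also attaches 'future'

# Hypothetical helper: number of distinct R processes that evaluate the futures
count_worker_pids <- function(n = nbrOfWorkers()) {
  pids <- future_sapply(seq_len(n), function(i) Sys.getpid())
  length(unique(pids))
}

# Assumption: fall back to background R sessions when forking is not supported,
# e.g. when running in RStudio
if (parallelly::supportsMulticore()) {
  plan(multicore, workers = 2)
} else {
  plan(multisession, workers = 2)
}

n_pids <- count_worker_pids()
print(n_pids)

plan(sequential)  # shut down any background workers
```

If this prints 1, everything ran in a single process, as in the RStudio transcript above. More than one distinct PID means the futures really were evaluated by separate workers.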
Thanks for the clarification. I can replicate your results on my computer. Looking at my system monitor, I see most threads having high usage when I use future_Map and plan(multicore). Could that still be sequential instead of using multiple cores?
If you see multiple cores running when using plan(multicore) in RStudio, which is equivalent to using plan(sequential), then something else is running in parallel that does not go through the future framework, e.g. multithreaded Rcpp code. See what you get with purrr::map().
Thanks for the reply. That could be the reason. I think I will still keep using plan(multicore) in conjunction with future_Map in RStudio on my Linux OS in case you implement this support in the future. Do you think there will be any unexpected side effects of doing this?
Forked processing can crash RStudio. Not all R packages are fork-proof. That's why we recommend against it. It's not something that future, or any other R parallelization framework, can fix.
OK. If I use plan(sequential) with future_sapply, will that be faster than R's base sapply?
"then there's something else that runs in parallel, which is not using the future framework, e.g. multithreaded Rcpp code"
Another possible source: multithreaded BLAS.
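One rough way to look for a multithreaded BLAS from within R, using only base functions, is sketched below. The environment variable names are the ones commonly honored by OpenMP, OpenBLAS, and MKL. Whether they apply depends on which BLAS your R is linked against, and the timing comparison is a heuristic, not a definitive test.

```r
# Which BLAS/LAPACK libraries is this R session linked against?
print(extSoftVersion()[["BLAS"]])
print(La_library())

# Environment variables commonly used to cap BLAS/OpenMP threads ("" = unset)
print(Sys.getenv(c("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS")))

# Heuristic: CPU time well above elapsed time suggests more than one thread
set.seed(42)
m <- matrix(rnorm(1500 * 1500), nrow = 1500)
timing <- system.time(crossprod(m))
cpu_to_elapsed <- (timing[["user.self"]] + timing[["sys.self"]]) / timing[["elapsed"]]
print(cpu_to_elapsed)
```

A ratio close to 1 is what you would expect from a single-threaded BLAS. A ratio clearly above 1 hints that the matrix product itself used several cores, independently of any future plan.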
When I use plan(multicore) in conjunction with future_Map in RStudio, R seems to use multiple cores. However, your future GitHub website says that using multiple cores in RStudio is not supported, which confuses me. I am using future.apply_1.7.0 in R 4.0.5 on Fedora 34 Linux OS. Any clarification would be greatly appreciated.