PLN-team / PLNmodels

A collection of Poisson lognormal models for multivariate count data analysis
https://pln-team.github.io/PLNmodels
GNU General Public License v3.0

Use parallel computing via future_lapply only when appropriate (detect multithreaded backend) #111

Open jchiquet opened 10 months ago

jchiquet commented 10 months ago

As mentioned by Cole Trapnell in PR https://github.com/PLN-team/PLNmodels/pull/110, future_lapply can significantly slow down computation when a multicore plan is active on top of a multithreaded backend (like OpenBLAS):

The issue is that on machines that use OpenBLAS with a multithreaded backend, using future can deadlock the session. A workaround is to wrap calls to future with something like this:

    # Remember the current thread settings (default to 1 if unset)
    old_omp_num_threads = as.numeric(Sys.getenv("OMP_NUM_THREADS"))
    if (is.na(old_omp_num_threads)){
      old_omp_num_threads = 1
    }
    RhpcBLASctl::omp_set_num_threads(1)

    old_blas_num_threads = as.numeric(Sys.getenv("OPENBLAS_NUM_THREADS"))
    if (is.na(old_blas_num_threads)){
      old_blas_num_threads = 1
    }
    RhpcBLASctl::blas_set_num_threads(1)

Then you do work with future and then:

    # Restore the previous thread settings
    RhpcBLASctl::omp_set_num_threads(old_omp_num_threads)
    RhpcBLASctl::blas_set_num_threads(old_blas_num_threads)

We didn't add this because we didn't want to add a new dependency on RhpcBLASctl to the package, but you could do so if you want to be able to do linear algebra inside of functions called by future.
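The save/set/restore pattern quoted above could be packaged as a small wrapper so that the previous thread counts are restored even if the wrapped code fails. This is only a sketch: `with_single_thread` is a hypothetical name, not something PLNmodels provides, and RhpcBLASctl is treated as an optional dependency (the wrapper does nothing special when it is not installed).

```r
# Hypothetical helper (not part of PLNmodels): evaluate `expr` with BLAS and
# OpenMP limited to one thread, then restore the previous settings.
with_single_thread <- function(expr) {
  if (requireNamespace("RhpcBLASctl", quietly = TRUE)) {
    # Query the current settings from RhpcBLASctl rather than from
    # environment variables (an assumption that differs from the snippet above).
    old_blas <- RhpcBLASctl::blas_get_num_procs()
    old_omp  <- RhpcBLASctl::omp_get_max_threads()
    on.exit({
      # Restore only the values that could actually be read.
      if (!is.na(old_blas)) RhpcBLASctl::blas_set_num_threads(old_blas)
      if (!is.na(old_omp))  RhpcBLASctl::omp_set_num_threads(old_omp)
    }, add = TRUE)
    RhpcBLASctl::blas_set_num_threads(1)
    RhpcBLASctl::omp_set_num_threads(1)
  }
  # `expr` is a lazy promise, so it is only evaluated here,
  # after the thread counts have been lowered.
  force(expr)
}

# Usage: the matrix product runs with single-threaded BLAS if RhpcBLASctl is present.
m <- matrix(1:4, 2, 2)
res <- with_single_thread(m %*% m)
```

Note that the setting is applied in the calling process; whether it carries over to workers depends on the plan in use (forked workers inherit it, separate sessions may not).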

I suggest defining a PLN_lapply function which, depending on the architecture in place, dispatches to a classic or a multicore lapply. To be checked: whether future can handle this by itself (via the 'sequential' or 'multicore' plan).
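A minimal sketch of what such a dispatcher might look like is given here. Only the name PLN_lapply comes from the suggestion; the body, the helper `blas_is_multithreaded`, and the policy of falling back to base lapply when a multithreaded BLAS is detected are assumptions for illustration. Both future.apply and RhpcBLASctl are treated as optional.

```r
# Hypothetical helper: TRUE if RhpcBLASctl reports more than one BLAS thread.
# Without RhpcBLASctl the backend is unknown and is treated as single-threaded
# (an assumption; a stricter policy could fall back to sequential instead).
blas_is_multithreaded <- function() {
  if (!requireNamespace("RhpcBLASctl", quietly = TRUE)) return(FALSE)
  isTRUE(RhpcBLASctl::blas_get_num_procs() > 1)
}

# Sketch of the suggested dispatcher: use future_lapply only when a parallel
# plan is active and no multithreaded BLAS is detected, else use base lapply.
PLN_lapply <- function(X, FUN, ...) {
  parallel_ok <-
    requireNamespace("future", quietly = TRUE) &&
    requireNamespace("future.apply", quietly = TRUE) &&
    future::nbrOfWorkers() > 1 &&   # 1 under the 'sequential' plan
    !blas_is_multithreaded()
  if (!parallel_ok) {
    return(lapply(X, FUN, ...))
  }
  future.apply::future_lapply(X, FUN, ..., future.seed = TRUE)
}

# Usage: same call signature as lapply.
squares <- PLN_lapply(1:3, function(i) i^2)
```

Under the default 'sequential' plan, `future::nbrOfWorkers()` is 1, so the call above goes through base lapply and no future machinery is involved.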