MaastrichtU-IDS / dsri-documentation

📖 Documentation for the Data Science Research Infrastructure at Maastricht University
https://dsri.maastrichtuniversity.nl
MIT License

RStudio containers use OpenBLAS with no restriction on the number of cores / threads #19

Closed wviechtb closed 3 years ago

wviechtb commented 3 years ago

The RStudio containers are configured to use OpenBLAS (great!). However, they do not restrict the number of cores / threads that OpenBLAS is allowed to use. This is not so great, since using all 64 cores / 128 threads (the default) usually ends up hampering performance. Here is an example where I create a 200x200 matrix and then invert it 100 times:

X <- MASS::mvrnorm(10000, mu=rep(0,200), Sigma=diag(200))
Z <- t(X) %*% X

system.time(tmp <- replicate(100, {solve(Z)[1,1]}))

This takes 20+ seconds and all 128 threads are at close to 100% utilization (in other words, the entire node is being saturated).

Now let's restrict the number of threads that OpenBLAS is allowed to use:

install.packages("RhpcBLASctl")
library(RhpcBLASctl)
blas_get_num_procs() # note that this is 64 by default
blas_set_num_threads(1) # set to 1

system.time(tmp <- replicate(100, {solve(Z)[1,1]}))

This now takes around 0.2 seconds and only a single thread is being used.

The problem with (implicitly) using all cores is magnified even further if one uses explicit parallelization, since each worker then tries to use all 64 cores and things slow down to a crawl.
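To make the oversubscription concrete: with 8 parallel workers that each let OpenBLAS use 64 threads, 8 * 64 = 512 threads would compete for 64 cores. A minimal sketch of how one might split the cores between workers and BLAS threads follows. The helper name `blas_threads_per_worker` is hypothetical and not part of any package mentioned in this thread.

```r
# Hypothetical helper (an illustration, not from the thread or any package):
# number of BLAS threads each worker may use so that
# n_workers * threads does not exceed the number of cores.
blas_threads_per_worker <- function(n_cores, n_workers) {
  stopifnot(n_cores >= 1, n_workers >= 1)
  # Integer division, but never go below one thread per worker.
  max(1L, as.integer(n_cores %/% n_workers))
}

blas_threads_per_worker(64, 8)    # 8 workers sharing 64 cores
blas_threads_per_worker(64, 64)   # one worker per core
blas_threads_per_worker(64, 100)  # more workers than cores
```

Each worker could then pass the result to `RhpcBLASctl::blas_set_num_threads()` before doing any matrix algebra.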

Usually, the number of threads is set via an environment variable:

export OPENBLAS_NUM_THREADS=1

This needs to happen before R/RStudio is started. Setting the environment variable from within R with:

Sys.setenv(OPENBLAS_NUM_THREADS=1)

does not work.
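Presumably this is because OpenBLAS reads the variable when the library is loaded, which has already happened by the time R code runs. Since the variable only has an effect when set before startup, it can be useful to check from within R what the session actually inherited. The following is a minimal sketch in base R. The helper name `openblas_env_threads` is hypothetical, and treating empty, non-numeric, or non-positive values as "no explicit limit" is an assumption of this sketch.

```r
# Hypothetical helper: report the OpenBLAS thread limit found in the
# environment. Returns NA when the variable is unset, empty, or not a
# positive integer (an assumption of this sketch).
openblas_env_threads <- function(value = Sys.getenv("OPENBLAS_NUM_THREADS")) {
  n <- suppressWarnings(as.integer(value))
  if (is.na(n) || n < 1L) NA_integer_ else n
}

n <- openblas_env_threads()
if (is.na(n)) {
  message("OPENBLAS_NUM_THREADS is not set; OpenBLAS will use its default.")
} else {
  message("OPENBLAS_NUM_THREADS was set to ", n, " at startup.")
}
```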

So, it would be great if the RStudio containers could be configured to set the environment variable as described above. Those who need more cores for their matrix algebra stuff (and know what they are doing) can still use the RhpcBLASctl package to adjust the number of threads.
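For those who do want more threads for a specific computation, a possible pattern is to raise the limit only for that computation and then drop back to a single thread. This is a sketch under stated assumptions: the wrapper `with_blas_threads` and its `restore` argument are hypothetical, and only `blas_set_num_threads()` from RhpcBLASctl (already used above) is called. If the package is not installed, the expression is simply evaluated with the current settings.

```r
# Hypothetical wrapper: evaluate `expr` with `n` BLAS threads, then reset
# the limit to `restore` (1 by default, matching the setting proposed here).
with_blas_threads <- function(n, expr, restore = 1L) {
  if (!requireNamespace("RhpcBLASctl", quietly = TRUE)) {
    # Package not available: evaluate with whatever the session uses.
    return(force(expr))
  }
  # Reset the thread limit when the function exits, even on error.
  on.exit(RhpcBLASctl::blas_set_num_threads(restore), add = TRUE)
  RhpcBLASctl::blas_set_num_threads(n)
  # `expr` is evaluated lazily, so it runs only after the limit is raised.
  force(expr)
}

# Usage: invert a matrix with up to 4 BLAS threads.
Z <- crossprod(matrix(rnorm(400 * 200), nrow = 400))
Zinv <- with_blas_threads(4, solve(Z))
```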

vemonet commented 3 years ago

Thanks a lot for reporting this! @wviechtb

I set this environment variable in the rocker/rstudio image we use for RStudio with root user deployment: https://github.com/MaastrichtU-IDS/dsri-openshift-applications/blob/main/templates-anyuid/template-rstudio-root-persistent.yml#L69

- name: OPENBLAS_NUM_THREADS
  displayName: Number of threads for OpenBLAS
  description: Restricting the number of thread allocated to OpenBLAS can speed up computations using OpenBLAS (leave empty otherwise)
  value: ""
  required: false

And:

env:
- name: OPENBLAS_NUM_THREADS
  value: "${OPENBLAS_NUM_THREADS}"

Could you try it in your project and let me know if it improves performance for you?

oc apply -f https://raw.githubusercontent.com/MaastrichtU-IDS/dsri-openshift-applications/main/templates-anyuid/template-rstudio-root-persistent.yml

Also feel free to let me know if you have a better description for the parameter, or a different default value! (I was thinking of leaving it empty to keep the original behavior by default, but maybe I need to set it to 0?)

wviechtb commented 3 years ago

Thanks for getting started on this! But OPENBLAS_NUM_THREADS needs to be set to 1, not blank. When blank, the default is used (which will be 64 on the DSRI nodes).

vemonet commented 3 years ago

Yes, but now you can choose the number of threads for OpenBLAS when you start an RStudio app from the template:

[Screenshot from 2020-12-02 18-07-57]

And you can set it to 1.

I updated the "RStudio with root user" template in your project.

wviechtb commented 3 years ago

Ah, I see! I would suggest filling in 1 as the default in the template, though.

vemonet commented 3 years ago

OK, I updated the template's default value to 1.

wviechtb commented 3 years ago

Tried it out and it works as intended. Thanks! I think the issue can be closed now.