Open atombaby opened 4 years ago
Expanding the scope a bit to include other libraries that have built-in threading available. Looking at you, xgboost.
more xgboost references:
https://xgboost.readthedocs.io/en/latest/R-package/xgboostPresentation.html https://xgboost.readthedocs.io/en/latest/parameter.html
its relationship to caret:
https://github.com/topepo/caret/issues/870 https://topepo.github.io/caret/model-training-and-tuning.html#an-example
Proposed Domain
Computing? Likely not in the parallel computing guide as this is a little bit of a corner case
Content Summary
BLAS will (by default) attempt to use all available cores on a node. This can be controlled with environment variables (set `OPENBLAS_NUM_THREADS`), at call time (using `openblas_set_num_threads(1)`), or using RhpcBLASctl.

However, if you have a loop (like `mclapply`) that calls an OpenBLAS function, each of the applied functions will call that BLAS function, which will in turn try to use all available processors. With N workers on a node with C cores, that is up to N × C threads competing for C cores (or fewer, if the job was allocated only part of the node).

We need to document and demonstrate this problem and examine ways to ensure the job is working within its allocation.
reference: https://fossies.org/linux/OpenBLAS/USAGE.md
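The environment-variable approach from the summary could look roughly like the sketch below, meant for a job script that launches R. It pins OpenBLAS to one thread per process and sets the number of forked workers to the number of allocated cores, so workers × BLAS threads stays within the allocation. Assumptions not in the original: the scheduler is Slurm and exports `SLURM_CPUS_PER_TASK`, and the R script relies on `MC_CORES` (which the parallel package reads to initialize its `mc.cores` option) rather than hard-coding `mc.cores`. The script name is hypothetical.

```shell
#!/bin/sh
# Cores granted to this job. SLURM_CPUS_PER_TASK is an assumption (Slurm);
# fall back to 1 core if the variable is not set.
alloc_cores="${SLURM_CPUS_PER_TASK:-1}"

# One BLAS thread per process, so each mclapply worker stays on one core.
export OPENBLAS_NUM_THREADS=1

# One forked worker per allocated core. R's parallel package uses MC_CORES
# to initialize the mc.cores option when the package is loaded.
export MC_CORES="$alloc_cores"

echo "workers=$MC_CORES blas_threads_per_worker=$OPENBLAS_NUM_THREADS"

# Hypothetical R script; uncomment and adjust the name to run it.
# Rscript my_analysis.R
```

The opposite split (one worker, all allocated cores given to BLAS) may suit code dominated by a few large matrix operations. The point is only that the product of the two numbers should not exceed the allocation.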
Local Content Expert(s) Suggest any Fred Hutch based experts who we might ask to contribute (GitHub ID is preferred, but name of someone and/or desired expertise is ok too).