ecpolley / SuperLearner

Current version of the SuperLearner R package

SL.gbm n.cores #141

Closed upenn-hughmac closed 2 years ago

upenn-hughmac commented 2 years ago

Hi SL team,

We're using SuperLearner in a cluster environment, so we need to control the number of cores each job uses: typically one, or whatever the user requests, but never all cores.

By default, gbm uses detectCores() to determine the number of cores. In that package I can set n.cores = N to change that behavior, but SL.gbm doesn't seem to expose n.cores as an option that could be modified as described in The Guide.

If I modify the source code to add an n.cores argument with a default (I like 1, which is what other 'parallel capable' methods use, but ...), I can then set n.cores when customizing the learner and so control the number of cores.

Any chance the SL.gbm.R code could be modified to add that option? While I got it working, I'm not confident enough to submit a pull request.

-Hugh

ecpolley commented 2 years ago

Hi Hugh,

Thanks for the suggestion. I updated SL.gbm to include the n.cores argument, but kept its default the same as in the gbm function. Users can override it easily with a wrapper such as:

SL.gbm1 <- function(...) SL.gbm(n.cores = 1, ...) # only use 1 core
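
As a rough illustration of how such a wrapper might be used, the sketch below passes it to SuperLearner by name alongside SL.mean. It assumes the gbm package is installed and that the installed SuperLearner version already has the n.cores argument in SL.gbm. The simulated data and the object names (X, Y, fit) are hypothetical and only there to make the example runnable.

```r
library(SuperLearner)  # assumes a version whose SL.gbm accepts n.cores

# Wrapper from the comment above: force gbm to use a single core.
SL.gbm1 <- function(...) SL.gbm(n.cores = 1, ...)

# Small simulated data set (hypothetical, for illustration only).
set.seed(1)
n <- 200
X <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
Y <- rbinom(n, 1, plogis(X$x1 - 0.5 * X$x2))

# Learners are referenced by function name; SuperLearner looks the
# wrapper up in the calling environment by default.
fit <- SuperLearner(Y = Y, X = X, family = binomial(),
                    SL.library = c("SL.mean", "SL.gbm1"))
print(fit)
```

Because the wrapper only fixes n.cores and forwards everything else through `...`, the other SL.gbm defaults are left unchanged.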

-Eric

upenn-hughmac commented 2 years ago

Works like a charm, thanks so much! We now have some pretty big single-core jobs running without core contention.