mlr-org / mlr3

mlr3: Machine Learning in R - next generation
https://mlr3.mlr-org.com
GNU Lesser General Public License v3.0

nthread not working in R on linux #808

Closed Sudolin closed 2 years ago

Sudolin commented 2 years ago

Hi! I found that surv.xgboost and surv.svm always cause the CPUs to overload without control when I set nthread to 1 and use a multisession plan with more than 2 workers during benchmarking. Is there a way to control the number of threads used when benchmarking these learners in nested CV? Below is an example of the code I use. Thanks.

library(mlr3)
library(mlr3pipelines)
library(mlr3verse)
library(mlr3extralearners)
library(mlr3learners)
library(mlr3proba)
library(mlr3tuning)

measure = msr("surv.cindex")
inner_resampling = rsmp("cv", folds = 5)
outer_resampling = rsmp("cv", folds = 3)
terminator = trm("none")
tuner = tnr("grid_search", resolution = 10)

# surv.xgboost wrapped in a distribution compositor; nthread is fixed to 1
xgb = as_learner(ppl("distrcompositor",
  learner = lrn("surv.xgboost", id = "xgb",
    nthread = 1,
    eta = to_tune(p_dbl(lower = 0.01, upper = 0.3)),
    nrounds = to_tune(p_int(lower = 1, upper = 1000)),
    max_depth = to_tune(p_int(lower = 2, upper = 10))
  )))

# inner loop of the nested CV: tune the learner with grid search
lrn.xgb = AutoTuner$new(
  learner = xgb,
  resampling = inner_resampling,
  measure = measure,
  terminator = terminator,
  tuner = tuner
)

# outer loop of the nested CV; sur_list is not defined in this snippet
grid = benchmark_grid(
  tasks = sur_list$tcga,
  learners = lrn.xgb,
  resamplings = outer_resampling
)

future::plan("multisession", workers = 4) # this causes way more than 4 cores to be used in my case
bmr = benchmark(grid, store_models = TRUE)

be-marc commented 2 years ago

Sorry for the late reply. Problems with parallelization are difficult to reproduce.

CPUs overload without control

Does your system stop responding?

surv.xgboost and surv.svm

The surv.svm learner has no nthread parameter, but you encounter the same issue with it?