yanyachen / rBayesianOptimization

Bayesian Optimization of Hyperparameters
81 stars 21 forks source link

xgboost cpu utilization inside BayesianOptimization #14

Closed yilisg closed 7 years ago

yilisg commented 7 years ago

Another quirk I noticed is that CPU/core utilization seems fairly low when running xgboost inside BayesianOptimization (it hovers around 25-30% on an 8-core/16-thread system). If I run the same xgb.cv separately, outside BayesianOptimization, utilization goes back to the typical 95-100%, since xgboost is designed to parallelize efficiently across cores. Please see the two screenshots and the code to reproduce them.

Do you observe the same effect and if so, what could be the reason or a workaround? Thanks!

# Example 2: Parameter Tuning
library(xgboost)
library(rBayesianOptimization)  # provides KFold() and BayesianOptimization()
data(agaricus.train, package = "xgboost")
dtrain <- xgb.DMatrix(agaricus.train$data,
                      label = agaricus.train$label)
cv_folds <- KFold(agaricus.train$label, nfolds = 5,
                  stratified = TRUE, seed = 0)
xgb_cv_bayes <- function(max.depth, min_child_weight, subsample) {
  cv <- xgb.cv(params = list(booster = "gbtree", eta = 0.01,
                             max_depth = max.depth,
                             min_child_weight = min_child_weight,
                             subsample = subsample, colsample_bytree = 0.3,
                             lambda = 1, alpha = 0,
                             objective = "binary:logistic",
                             eval_metric = "auc"),
               data = dtrain, nrounds = 1000,
               folds = cv_folds, prediction = TRUE, showsd = TRUE,
               early_stopping_rounds = 1000, maximize = TRUE, verbose = 1)
  list(Score = max(cv$evaluation_log$test_auc_mean),
       Pred = cv$pred)
}

# running xgb.cv alone ~close to 95-100% cpu utilization (first screenshot)
xgb.cv(params = list(booster = "gbtree", eta = 0.01,
                     max_depth = 20L,
                     min_child_weight = 1L,
                     subsample = 0.8, colsample_bytree = 0.3,
                     lambda = 1, alpha = 0,
                     objective = "binary:logistic",
                     eval_metric = "auc"),
       data = dtrain, nrounds = 1000,
       folds = cv_folds, prediction = TRUE, showsd = TRUE,
       early_stopping_rounds = 1000, maximize = TRUE, verbose = 1)

# running xgb.cv inside BayesianOptimization ~at most 30% cpu utilization (second screenshot)
OPT_Res <- BayesianOptimization(xgb_cv_bayes,
                                bounds = list(max.depth = c(10L, 20L),
                                              min_child_weight = c(1L, 10L),
                                              subsample = c(0.5, 0.8)),
                                init_grid_dt = NULL, init_points = 10, n_iter = 20,
                                acq = "ei", kappa = 2.576, eps = 5.0,
                                verbose = TRUE)

[screenshot 1: xgb.cv alone, ~95-100% CPU utilization]

[screenshot 2: xgb.cv inside BayesianOptimization, ~30% CPU utilization]
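One thing that might be worth ruling out (a guess on my part, not something verified in this thread) is xgboost's thread count being implicitly reduced when called from inside the optimization loop. Pinning `nthread` explicitly in the params list would eliminate that variable; the variant below is just the objective function above with one added parameter:

```r
library(parallel)  # for detectCores()

# Same objective as above, but with nthread pinned explicitly so that
# xgboost always uses every logical core, regardless of caller context.
xgb_cv_bayes_pinned <- function(max.depth, min_child_weight, subsample) {
  cv <- xgb.cv(params = list(booster = "gbtree", eta = 0.01,
                             max_depth = max.depth,
                             min_child_weight = min_child_weight,
                             subsample = subsample, colsample_bytree = 0.3,
                             lambda = 1, alpha = 0,
                             nthread = detectCores(),  # pin thread count
                             objective = "binary:logistic",
                             eval_metric = "auc"),
               data = dtrain, nrounds = 1000,
               folds = cv_folds, prediction = TRUE, showsd = TRUE,
               early_stopping_rounds = 1000, maximize = TRUE, verbose = 1)
  list(Score = max(cv$evaluation_log$test_auc_mean),
       Pred = cv$pred)
}
```

If utilization stays low even with `nthread` pinned, the bottleneck is presumably outside xgboost.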

yanyachen commented 7 years ago

I don't see the same behavior on my Mac or my AWS Linux server. On my Windows laptop, xgb.cv alone usually reaches about 80% CPU usage, while inside BayesianOptimization it reaches about 70%, though there is barely any difference in running time.

yilisg commented 7 years ago

Thanks. After a restart I could get around 60% CPU on Windows, and I see no issue on Ubuntu, so it is probably OS-dependent.

brebbles commented 5 years ago

I've noticed this issue as well, and while watching Task Manager I think I have a theory (in my case at least).

XGBoost can run in parallel, but I'm not sure that rBayesianOptimization can. So when rBayesianOptimization calls XGBoost to run CV, the CV itself runs multi-threaded; but once the CV results are passed back to rBayesianOptimization, it drops to a single thread to fit the Gaussian Process that proposes the next set of hyperparameters. In my experience this part of the process can take quite some time, especially as n_iter grows (even beyond 50 in my experience).
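One rough way to test this theory (a sketch, not something I've benchmarked here) is to time the xgb.cv calls separately from the whole optimization run; if the gap is large, the time is being spent in the single-threaded Gaussian Process / acquisition step between iterations. This assumes the `xgb_cv_bayes` objective and bounds from the original post:

```r
# Accumulate wall-clock time spent inside xgb.cv across all evaluations.
cv_time <- 0
timed_xgb_cv_bayes <- function(max.depth, min_child_weight, subsample) {
  t0 <- Sys.time()
  res <- xgb_cv_bayes(max.depth, min_child_weight, subsample)
  cv_time <<- cv_time + as.numeric(difftime(Sys.time(), t0, units = "secs"))
  res
}

# Total elapsed time for the whole optimization run.
total <- system.time(
  BayesianOptimization(timed_xgb_cv_bayes,
                       bounds = list(max.depth = c(10L, 20L),
                                     min_child_weight = c(1L, 10L),
                                     subsample = c(0.5, 0.8)),
                       init_grid_dt = NULL, init_points = 10, n_iter = 20,
                       acq = "ei", kappa = 2.576, eps = 5.0,
                       verbose = TRUE)
)["elapsed"]

# Time not spent in xgb.cv is spent in the GP fit / acquisition search.
cat("in xgb.cv:", cv_time, "s; elsewhere:", total - cv_time, "s\n")
```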

Apologies if I've made a mistake here and rBayesianOptimization can indeed run in parallel, but to my knowledge it cannot.