yanyachen / rBayesianOptimization

Bayesian Optimization of Hyperparameters

How to parallel the process? #11

Open NamLQ opened 7 years ago

NamLQ commented 7 years ago

Hi!

I am using this package to optimize 9 hyper-parameters of xgboost: eta, max_depth, min_child_weight, gamma, subsample, colsample_bytree, colsample_bylevel, lambda, alpha. Computing each new set of hyper-parameters is too slow. How can I make it faster?

Thank you very much.

yanyachen commented 7 years ago

I have profiled this before. The cause is that the GPfit package is slow when you have many hyper-parameters (dimensions). I may try to switch to a faster GP backend later. For now, you can try to focus on the 3 or 4 most important hyper-parameters first (e.g. max_depth, min_child_weight, subsample, colsample_bytree).
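
The suggestion above can be sketched as follows: tune only four hyper-parameters and hold the other five fixed, so the GP sees a 4-dimensional search space instead of a 9-dimensional one. This is a minimal sketch, not the maintainer's code. `make_objective`, `toy_score`, and the fixed values are hypothetical names and numbers chosen for illustration, and `toy_score` stands in for a real xgboost cross-validation score.

```r
# Hypothetical helper: build an objective over four tuned hyper-parameters
# while the remaining five stay fixed at chosen values.
make_objective <- function(fixed_params, score_fun) {
  function(max_depth, min_child_weight, subsample, colsample_bytree) {
    params <- c(
      list(max_depth = max_depth,
           min_child_weight = min_child_weight,
           subsample = subsample,
           colsample_bytree = colsample_bytree),
      fixed_params
    )
    # BayesianOptimization expects a list with Score (maximized) and Pred.
    list(Score = score_fun(params), Pred = 0)
  }
}

# Values held constant (illustrative only, not recommendations).
fixed_params <- list(eta = 0.1, gamma = 0, colsample_bylevel = 1,
                     lambda = 1, alpha = 0)

# Stand-in for a real cross-validation score, so the sketch runs on its own.
toy_score <- function(params) {
  -(params$max_depth - 6)^2 - (params$subsample - 0.8)^2
}

objective <- make_objective(fixed_params, toy_score)

# Only these four dimensions would be handed to the GP.
bounds <- list(max_depth = c(2L, 10L),
               min_child_weight = c(1L, 10L),
               subsample = c(0.5, 1.0),
               colsample_bytree = c(0.5, 1.0))

res <- objective(max_depth = 6L, min_child_weight = 1L,
                 subsample = 0.8, colsample_bytree = 0.8)
print(res$Score)
```

`objective` and `bounds` could then be passed as `FUN` and `bounds` to `BayesianOptimization`, with `toy_score` replaced by a function that runs the actual cross-validation.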

yilisg commented 7 years ago

@NamLQ @yanyachen On a related note, I observed that when n_iter gets large (say more than 100), the process becomes substantially slower. If you run the snippet below (only a 3-dimensional search), the "elapsed" is always around 1s +/- 0.5s per round, but the actual wall-clock time for Rounds 1 to 100 and for Rounds 101 to 200 is very different. Not sure if this is related to my issue #14; perhaps GP_fit is doing something (very slowly, single-threaded) as the number of rounds grows, while xgboost is just idling and waiting to be re-invoked?

OPT_Res <- BayesianOptimization(xgb_cv_bayes,
                                bounds = list(max.depth = c(10L, 20L),
                                              min_child_weight = c(1L, 10L),
                                              subsample = c(0.5, 0.8)),
                                init_grid_dt = NULL, init_points = 10, n_iter = 500,
                                acq = "ei", kappa = 2.576, eps = 5.0,
                                verbose = TRUE)
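
One plausible explanation for the slowdown described above: an exact GP fit factorizes an n x n covariance matrix, which costs on the order of n^3 operations, and the model is refit every round as n grows by one. The base-R sketch below times a single Cholesky factorization for increasing n. It is an illustration under assumptions, not GPfit's actual code: `sq_exp_kernel` and `time_gp_factorization` are hypothetical helpers, and a real fit usually repeats this factorization many times while optimizing the kernel parameters.

```r
# Squared-exponential kernel matrix with a small nugget for stability.
# (A stand-in: GPfit uses its own correlation function and optimizer.)
sq_exp_kernel <- function(X, lengthscale = 0.2, nugget = 1e-6) {
  D2 <- as.matrix(dist(X))^2
  exp(-D2 / (2 * lengthscale^2)) + diag(nugget, nrow(X))
}

# Time one Cholesky factorization, the dominant cost of a single
# likelihood evaluation, for n observed points in d dimensions.
time_gp_factorization <- function(n, d = 3) {
  X <- matrix(runif(n * d), nrow = n)
  K <- sq_exp_kernel(X)
  system.time(chol(K))[["elapsed"]]
}

set.seed(1)
sizes <- c(100, 200, 400, 800)
timings <- sapply(sizes, time_gp_factorization)
print(data.frame(n_points = sizes, chol_seconds = timings))
```

The absolute numbers depend on the machine and BLAS library, but the growth with n would be consistent with the per-round "elapsed" (which times only the objective) staying flat while wall-clock time per round increases.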

NamLQ commented 7 years ago

@yilisg I'm sure this is because of GP_fit. You can try this https://github.com/HIPS/Spearmint

yanyachen commented 7 years ago

The main drawback of BayesianOptimization in terms of speed comes from the GPfit::GP_fit function. I chose the GPfit package at first because it was the most convenient one to use. I'm trying to find an alternative Gaussian Process package (using Rcpp) on CRAN. If you guys have any recommendations, please let me know. Thanks.

yilisg commented 7 years ago

I see. Thanks both.

ChJahns commented 7 years ago

Hello, I really like the package but the speed is an issue. GauPro might be worth a try. It seems to be very new and the documentation is still very short, hence I am not sure if it is suitable. https://cran.r-project.org/web/packages/GauPro/index.html

The estimation is done like this: gp <- GauPro$new(X=x, Z=y, parallel=FALSE)

and the prediction like this: gp$predict(XX = x)

yanyachen commented 7 years ago

I have checked this package before, and I think it is the fastest GP package by far. But GauPro doesn't provide the kernels needed for BayesianOptimization and doesn't support a user-provided kernel/correlation function, so I didn't use it. I will definitely switch to it once it supports the kernels needed for BayesianOptimization.

BartekRoszak commented 6 years ago

Hey, Have you checked laGP? https://cran.r-project.org/web/packages/laGP/laGP.pdf

There is also a paper about GPs in R: https://www.jstatsoft.org/article/view/v063i10/v63i10.pdf

alex7tula commented 6 years ago

GauPro was updated in Sep 2017. Maybe you can use it now?

alex7tula commented 6 years ago

Very slow. I tried to optimize over 20 dimensions, each with bounds 0:1L (binary 0 or 1). FUN itself is very fast (about 0 sec.); it just returns 1 - sum() over the dimensions. Total time to calculate: 1:23:46.

[1] "Start time: 2018-05-08 12:16:06"
Loading required package: rBayesianOptimization
elapsed = 0.00  Round = 1       1 = 1.0000      2 = 0.0000      3 = 1.0000      4 = 0.0000      5 = 0.0000      6 = 1.0000      7 = 1.0000      8 = 1.0000      9 = 1.0000      10 = 0.0000     11 = 1.0000     12 = 1.0000     13 = 1.0000     14 = 1.0000     15 = 0.0000     16 = 0.0000     17 = 1.0000     18 = 1.0000     19 = 0.0000     20 = 0.0000     Value = -11.0000 
elapsed = 0.00  Round = 2       1 = 1.0000      2 = 1.0000      3 = 1.0000      4 = 1.0000      5 = 0.0000      6 = 0.0000      7 = 0.0000      8 = 0.0000      9 = 0.0000      10 = 1.0000     11 = 0.0000     12 = 1.0000     13 = 0.0000     14 = 0.0000     15 = 0.0000     16 = 0.0000     17 = 1.0000     18 = 0.0000     19 = 1.0000     20 = 0.0000     Value = -7.0000 
elapsed = 0.00  Round = 3       1 = 0.0000      2 = 0.0000      3 = 1.0000      4 = 0.0000      5 = 1.0000      6 = 0.0000      7 = 1.0000      8 = 1.0000      9 = 0.0000      10 = 0.0000     11 = 1.0000     12 = 0.0000     13 = 1.0000     14 = 1.0000     15 = 0.0000     16 = 0.0000     17 = 0.0000     18 = 1.0000     19 = 1.0000     20 = 1.0000     Value = -9.0000 
elapsed = 0.00  Round = 4       1 = 0.0000      2 = 1.0000      3 = 1.0000      4 = 0.0000      5 = 1.0000      6 = 1.0000      7 = 0.0000      8 = 1.0000      9 = 0.0000      10 = 0.0000     11 = 0.0000     12 = 0.0000     13 = 1.0000     14 = 1.0000     15 = 0.0000     16 = 1.0000     17 = 0.0000     18 = 0.0000     19 = 0.0000     20 = 0.0000     Value = -7.0000 
elapsed = 0.00  Round = 5       1 = 1.0000      2 = 1.0000      3 = 1.0000      4 = 0.0000      5 = 1.0000      6 = 1.0000      7 = 1.0000      8 = 0.0000      9 = 1.0000      10 = 0.0000     11 = 0.0000     12 = 0.0000     13 = 0.0000     14 = 0.0000     15 = 1.0000     16 = 1.0000     17 = 1.0000     18 = 1.0000     19 = 0.0000     20 = 0.0000     Value = -10.0000 
elapsed = 0.00  Round = 6       1 = 0.0000      2 = 1.0000      3 = 0.0000      4 = 1.0000      5 = 1.0000      6 = 1.0000      7 = 0.0000      8 = 1.0000      9 = 0.0000      10 = 1.0000     11 = 1.0000     12 = 1.0000     13 = 1.0000     14 = 1.0000     15 = 1.0000     16 = 1.0000     17 = 1.0000     18 = 1.0000     19 = 1.0000     20 = 0.0000     Value = -14.0000 
elapsed = 0.00  Round = 7       1 = 0.0000      2 = 1.0000      3 = 1.0000      4 = 1.0000      5 = 0.0000      6 = 0.0000      7 = 0.0000      8 = 1.0000      9 = 0.0000      10 = 1.0000     11 = 1.0000     12 = 0.0000     13 = 0.0000     14 = 0.0000     15 = 0.0000     16 = 0.0000     17 = 1.0000     18 = 1.0000     19 = 1.0000     20 = 1.0000     Value = -9.0000 
elapsed = 0.00  Round = 8       1 = 1.0000      2 = 0.0000      3 = 0.0000      4 = 0.0000      5 = 1.0000      6 = 0.0000      7 = 1.0000      8 = 0.0000      9 = 0.0000      10 = 0.0000     11 = 0.0000     12 = 1.0000     13 = 1.0000     14 = 1.0000     15 = 0.0000     16 = 0.0000     17 = 1.0000     18 = 1.0000     19 = 1.0000     20 = 0.0000     Value = -8.0000 
elapsed = 0.00  Round = 9       1 = 0.0000      2 = 0.0000      3 = 0.0000      4 = 0.0000      5 = 0.0000      6 = 1.0000      7 = 0.0000      8 = 0.0000      9 = 0.0000      10 = 1.0000     11 = 0.0000     12 = 1.0000     13 = 1.0000     14 = 0.0000     15 = 0.0000     16 = 1.0000     17 = 1.0000     18 = 1.0000     19 = 0.0000     20 = 0.0000     Value = -6.0000 
elapsed = 0.00  Round = 10      1 = 0.0000      2 = 0.0000      3 = 0.0000      4 = 1.0000      5 = 1.0000      6 = 0.0000      7 = 0.0000      8 = 0.0000      9 = 1.0000      10 = 1.0000     11 = 0.0000     12 = 1.0000     13 = 1.0000     14 = 1.0000     15 = 0.0000     16 = 0.0000     17 = 1.0000     18 = 1.0000     19 = 0.0000     20 = 0.0000     Value = -8.0000 
elapsed = 0.00  Round = 11      1 = 0.0000      2 = 1.0000      3 = 1.0000      4 = 1.0000      5 = 0.0000      6 = 0.0000      7 = 1.0000      8 = 0.0000      9 = 0.0000      10 = 0.0000     11 = 0.0000     12 = 0.0000     13 = 1.0000     14 = 1.0000     15 = 1.0000     16 = 0.0000     17 = 1.0000     18 = 1.0000     19 = 1.0000     20 = 0.0000     Value = -9.0000 
elapsed = 0.00  Round = 12      1 = 1.0000      2 = 1.0000      3 = 1.0000      4 = 0.0000      5 = 0.0000      6 = 0.0000      7 = 0.0000      8 = 0.0000      9 = 0.0000      10 = 1.0000     11 = 0.0000     12 = 1.0000     13 = 0.0000     14 = 1.0000     15 = 1.0000     16 = 0.0000     17 = 1.0000     18 = 1.0000     19 = 1.0000     20 = 0.0000     Value = -9.0000 
elapsed = 0.00  Round = 13      1 = 0.0000      2 = 1.0000      3 = 1.0000      4 = 0.0000      5 = 0.0000      6 = 1.0000      7 = 0.0000      8 = 0.0000      9 = 1.0000      10 = 0.0000     11 = 0.0000     12 = 1.0000     13 = 1.0000     14 = 0.0000     15 = 0.0000     16 = 0.0000     17 = 0.0000     18 = 0.0000     19 = 1.0000     20 = 1.0000     Value = -7.0000 
elapsed = 0.00  Round = 14      1 = 0.0000      2 = 0.0000      3 = 1.0000      4 = 0.0000      5 = 1.0000      6 = 0.0000      7 = 1.0000      8 = 1.0000      9 = 1.0000      10 = 1.0000     11 = 0.0000     12 = 1.0000     13 = 0.0000     14 = 1.0000     15 = 0.0000     16 = 0.0000     17 = 1.0000     18 = 0.0000     19 = 1.0000     20 = 1.0000     Value = -10.0000 
elapsed = 0.00  Round = 15      1 = 0.0000      2 = 0.0000      3 = 0.0000      4 = 1.0000      5 = 1.0000      6 = 0.0000      7 = 1.0000      8 = 0.0000      9 = 1.0000      10 = 0.0000     11 = 1.0000     12 = 0.0000     13 = 0.0000     14 = 1.0000     15 = 0.0000     16 = 1.0000     17 = 0.0000     18 = 1.0000     19 = 0.0000     20 = 0.0000     Value = -7.0000 
elapsed = 0.00  Round = 16      1 = 1.0000      2 = 0.0000      3 = 0.0000      4 = 0.0000      5 = 0.0000      6 = 0.0000      7 = 0.0000      8 = 1.0000      9 = 1.0000      10 = 0.0000     11 = 1.0000     12 = 1.0000     13 = 0.0000     14 = 0.0000     15 = 0.0000     16 = 1.0000     17 = 0.0000     18 = 0.0000     19 = 0.0000     20 = 1.0000     Value = -6.0000 
elapsed = 0.00  Round = 17      1 = 0.0000      2 = 1.0000      3 = 1.0000      4 = 1.0000      5 = 0.0000      6 = 1.0000      7 = 1.0000      8 = 0.0000      9 = 0.0000      10 = 1.0000     11 = 1.0000     12 = 1.0000     13 = 0.0000     14 = 0.0000     15 = 0.0000     16 = 1.0000     17 = 1.0000     18 = 0.0000     19 = 1.0000     20 = 0.0000     Value = -10.0000 
elapsed = 0.00  Round = 18      1 = 1.0000      2 = 1.0000      3 = 0.0000      4 = 0.0000      5 = 0.0000      6 = 1.0000      7 = 1.0000      8 = 0.0000      9 = 1.0000      10 = 1.0000     11 = 1.0000     12 = 1.0000     13 = 0.0000     14 = 1.0000     15 = 0.0000     16 = 0.0000     17 = 0.0000     18 = 1.0000     19 = 0.0000     20 = 1.0000     Value = -10.0000 
elapsed = 0.00  Round = 19      1 = 0.0000      2 = 1.0000      3 = 0.0000      4 = 0.0000      5 = 0.0000      6 = 1.0000      7 = 1.0000      8 = 1.0000      9 = 0.0000      10 = 0.0000     11 = 0.0000     12 = 0.0000     13 = 0.0000     14 = 1.0000     15 = 1.0000     16 = 1.0000     17 = 1.0000     18 = 0.0000     19 = 1.0000     20 = 0.0000     Value = -8.0000 
elapsed = 0.00  Round = 20      1 = 1.0000      2 = 0.0000      3 = 0.0000      4 = 0.0000      5 = 0.0000      6 = 1.0000      7 = 1.0000      8 = 0.0000      9 = 0.0000      10 = 0.0000     11 = 1.0000     12 = 1.0000     13 = 0.0000     14 = 0.0000     15 = 0.0000     16 = 1.0000     17 = 0.0000     18 = 0.0000     19 = 0.0000     20 = 0.0000     Value = -5.0000 

 Best Parameters Found: 
Round = 20      1 = 1.0000      2 = 0.0000      3 = 0.0000      4 = 0.0000      5 = 0.0000      6 = 1.0000      7 = 1.0000      8 = 0.0000      9 = 0.0000      10 = 0.0000     11 = 1.0000     12 = 1.0000     13 = 0.0000     14 = 0.0000     15 = 0.0000     16 = 1.0000     17 = 0.0000     18 = 0.0000     19 = 0.0000     20 = 0.0000     Value = -5.0000 
[1] "End time: 2018-05-08 13:39:52 Work time: 1:23:46"
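
For reference, the objective described above can be reconstructed as a short sketch and checked against the log. `binary_sum_fun` is a hypothetical name; the original FUN was not posted, so this only assumes it returns 1 minus the sum of its 0/1 arguments, as the comment states.

```r
# Hypothetical reconstruction of the objective: accepts any number of
# 0/1 arguments and returns 1 minus their sum as the score to maximize.
binary_sum_fun <- function(...) {
  x <- c(...)
  list(Score = 1 - sum(x), Pred = 0)
}

# Parameter values copied from Round 1 and Round 20 of the log above.
round_1  <- c(1, 0, 1, 0, 0, 1, 1, 1, 1, 0,
              1, 1, 1, 1, 0, 0, 1, 1, 0, 0)
round_20 <- c(1, 0, 0, 0, 0, 1, 1, 0, 0, 0,
              1, 1, 0, 0, 0, 1, 0, 0, 0, 0)

score_1  <- do.call(binary_sum_fun, as.list(round_1))$Score
score_20 <- do.call(binary_sum_fun, as.list(round_20))$Score
print(c(round_1 = score_1, round_20 = score_20))
```

Both scores agree with the logged values (-11 for Round 1 and -5 for Round 20), and each call is effectively instantaneous, which supports the point that the run time is spent outside FUN.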

openclosure commented 6 years ago

@alex7tula What is the context for your last comment? Was that using your own GauPro backend? Using rBayesianOptimization with XGBoost, I'm seeing the same thing as you guys: the majority of the time is spent fitting the GP rather than building trees. I'm wondering if it is worth swapping out GPfit.