Open NamLQ opened 7 years ago
I have profiled this before. The cause is that the GPfit package is slow when you have many hyper-parameters (dimensions). I may try to switch to a faster GP backend later. For now, you can try to focus on the 3 or 4 most important hyper-parameters first (e.g. max_depth, min_child_weight, subsample, colsample_bytree).
@NamLQ @yanyachen On a related note, I observed that when n_iter gets large (say, more than 100), the process becomes substantially slower. If you run the snippet below (only a 3-dimensional search), the reported "elapsed" is always around 1s +/- 0.5s per round, but the actual wall-clock time for Rounds 1 to 100 versus Rounds 101 to 200 is very different. Not sure if this is related to my issue #14; perhaps GP_fit is doing something (very slowly, single-threaded) as the number of rounds grows, while xgboost is just idling and waiting to be re-invoked?
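As a rough sketch of the "tune fewer parameters" suggestion: the snippet below splits a hypothetical search space into the four parameters named above and a set held fixed. The ranges, the extra parameter names, and the midpoint placeholders are all assumptions for illustration, not recommendations.

```r
# Hypothetical full search space; ranges are illustrative only.
all_bounds <- list(
  max_depth        = c(2L, 10L),
  min_child_weight = c(1L, 10L),
  subsample        = c(0.5, 1.0),
  colsample_bytree = c(0.5, 1.0),
  gamma            = c(0, 5),
  lambda           = c(0, 5),
  alpha            = c(0, 5)
)

# Tune only the parameters suggested in the comment above.
tuned <- c("max_depth", "min_child_weight", "subsample", "colsample_bytree")
bounds <- all_bounds[tuned]

# Hold the remaining parameters at fixed values (midpoints, purely as a placeholder).
fixed <- lapply(all_bounds[setdiff(names(all_bounds), tuned)], mean)

str(bounds)
str(fixed)
```

Here `bounds` would be passed as the `bounds` argument of BayesianOptimization(), and `fixed` merged into the xgboost parameter list inside the objective function, so the GP only has to model four dimensions.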
OPT_Res <- BayesianOptimization(xgb_cv_bayes,
                                bounds = list(max.depth = c(10L, 20L),
                                              min_child_weight = c(1L, 10L),
                                              subsample = c(0.5, 0.8)),
                                init_grid_dt = NULL, init_points = 10, n_iter = 500,
                                acq = "ei", kappa = 2.576, eps = 5.0,
                                verbose = TRUE)
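One way to check the gap between the reported "elapsed" and real wall-clock time is to log when each objective call starts. This is a minimal base R sketch; `with_call_log` and `dummy` are hypothetical names, and the gap between consecutive calls includes both the objective's own run time and whatever the optimizer does in between.

```r
# Hypothetical helper: wraps an objective so the wall-clock start of each call is recorded.
with_call_log <- function(fun) {
  starts <- numeric(0)
  wrapped <- function(...) {
    starts <<- c(starts, as.numeric(Sys.time()))  # append this call's start time
    fun(...)
  }
  # gaps() returns the seconds between consecutive call starts.
  list(fun = wrapped, gaps = function() diff(starts))
}

# Stand-in objective in the list(Score, Pred) shape rBayesianOptimization expects.
dummy <- function(x) list(Score = -(x - 1)^2, Pred = 0)

logged <- with_call_log(dummy)
for (x in c(0.1, 0.5, 0.9)) logged$fun(x)
print(logged$gaps())
```

Passing `logged$fun` in place of the real objective and comparing `logged$gaps()` with the per-round "elapsed" values would show how much time is spent outside the objective.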
@yilisg I'm sure this is because of GP_fit. You can try this https://github.com/HIPS/Spearmint
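For intuition on why GP fitting gets slower as rounds accumulate: exact GP fitting factorizes an n x n correlation matrix, where n is the number of points evaluated so far, and that cost grows roughly cubically in n. The sketch below is not GPfit's code; it only times a single Cholesky factorization of a Gaussian correlation matrix, with an assumed length scale and nugget. A real fit repeats this many times while estimating hyper-parameters.

```r
# Illustrative only: time one Cholesky factorization of an n x n correlation matrix.
time_gp_factorization <- function(n, d = 3, seed = 1) {
  set.seed(seed)
  X  <- matrix(runif(n * d), nrow = n)   # n design points in d dimensions
  D2 <- as.matrix(dist(X))^2             # squared Euclidean distances
  R  <- exp(-10 * D2) + diag(1e-6, n)    # Gaussian correlation plus a small nugget
  unname(system.time(chol(R))["elapsed"])
}

sizes <- c(100, 200, 400, 800)
timings <- data.frame(
  n = sizes,
  seconds = vapply(sizes, time_gp_factorization, numeric(1))
)
print(timings)
```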
The main speed bottleneck in BayesianOptimization is the GPfit::GP_fit function. I chose the GPfit package at first because it was the most convenient one to use. I'm trying to find an alternative Gaussian Process package (using Rcpp) on CRAN. If you have any recommendations, please let me know. Thanks.
I see. Thanks both.
Hello, I really like the package, but the speed is an issue. GauPro might be worth a try. It seems to be very new and the documentation is still very short, so I am not sure whether it is suitable. https://cran.r-project.org/web/packages/GauPro/index.html
The estimation is done like this: gp <- GauPro$new(X=x, Z=y, parallel=FALSE)
and the prediction like this: gp$predict(XX = x)
I have checked this package before, and I think it is the fastest GP package by far. But GauPro doesn't provide kernels for BayesianOptimization and doesn't support user-provided kernel/correlation functions, so I didn't use it. I will definitely switch to it when it supports kernels for BayesianOptimization.
Hey, have you checked laGP? https://cran.r-project.org/web/packages/laGP/laGP.pdf
Also, there is a paper about GP in R. https://www.jstatsoft.org/article/view/v063i10/v63i10.pdf
GauPro was updated in Sep 2017. Maybe you can use it now?
Very slow. I tried optimizing over 20 dimensions, each bounded to 0:1L (binary 0 or 1). FUN itself is very fast (about 0 sec.): it just returns 1 - sum() over the dimensions. Total run time: 1:23:46.
[1] "Start time: 2018-05-08 12:16:06"
Loading required package: rBayesianOptimization
elapsed = 0.00 Round = 1 1 = 1.0000 2 = 0.0000 3 = 1.0000 4 = 0.0000 5 = 0.0000 6 = 1.0000 7 = 1.0000 8 = 1.0000 9 = 1.0000 10 = 0.0000 11 = 1.0000 12 = 1.0000 13 = 1.0000 14 = 1.0000 15 = 0.0000 16 = 0.0000 17 = 1.0000 18 = 1.0000 19 = 0.0000 20 = 0.0000 Value = -11.0000
elapsed = 0.00 Round = 2 1 = 1.0000 2 = 1.0000 3 = 1.0000 4 = 1.0000 5 = 0.0000 6 = 0.0000 7 = 0.0000 8 = 0.0000 9 = 0.0000 10 = 1.0000 11 = 0.0000 12 = 1.0000 13 = 0.0000 14 = 0.0000 15 = 0.0000 16 = 0.0000 17 = 1.0000 18 = 0.0000 19 = 1.0000 20 = 0.0000 Value = -7.0000
elapsed = 0.00 Round = 3 1 = 0.0000 2 = 0.0000 3 = 1.0000 4 = 0.0000 5 = 1.0000 6 = 0.0000 7 = 1.0000 8 = 1.0000 9 = 0.0000 10 = 0.0000 11 = 1.0000 12 = 0.0000 13 = 1.0000 14 = 1.0000 15 = 0.0000 16 = 0.0000 17 = 0.0000 18 = 1.0000 19 = 1.0000 20 = 1.0000 Value = -9.0000
elapsed = 0.00 Round = 4 1 = 0.0000 2 = 1.0000 3 = 1.0000 4 = 0.0000 5 = 1.0000 6 = 1.0000 7 = 0.0000 8 = 1.0000 9 = 0.0000 10 = 0.0000 11 = 0.0000 12 = 0.0000 13 = 1.0000 14 = 1.0000 15 = 0.0000 16 = 1.0000 17 = 0.0000 18 = 0.0000 19 = 0.0000 20 = 0.0000 Value = -7.0000
elapsed = 0.00 Round = 5 1 = 1.0000 2 = 1.0000 3 = 1.0000 4 = 0.0000 5 = 1.0000 6 = 1.0000 7 = 1.0000 8 = 0.0000 9 = 1.0000 10 = 0.0000 11 = 0.0000 12 = 0.0000 13 = 0.0000 14 = 0.0000 15 = 1.0000 16 = 1.0000 17 = 1.0000 18 = 1.0000 19 = 0.0000 20 = 0.0000 Value = -10.0000
elapsed = 0.00 Round = 6 1 = 0.0000 2 = 1.0000 3 = 0.0000 4 = 1.0000 5 = 1.0000 6 = 1.0000 7 = 0.0000 8 = 1.0000 9 = 0.0000 10 = 1.0000 11 = 1.0000 12 = 1.0000 13 = 1.0000 14 = 1.0000 15 = 1.0000 16 = 1.0000 17 = 1.0000 18 = 1.0000 19 = 1.0000 20 = 0.0000 Value = -14.0000
elapsed = 0.00 Round = 7 1 = 0.0000 2 = 1.0000 3 = 1.0000 4 = 1.0000 5 = 0.0000 6 = 0.0000 7 = 0.0000 8 = 1.0000 9 = 0.0000 10 = 1.0000 11 = 1.0000 12 = 0.0000 13 = 0.0000 14 = 0.0000 15 = 0.0000 16 = 0.0000 17 = 1.0000 18 = 1.0000 19 = 1.0000 20 = 1.0000 Value = -9.0000
elapsed = 0.00 Round = 8 1 = 1.0000 2 = 0.0000 3 = 0.0000 4 = 0.0000 5 = 1.0000 6 = 0.0000 7 = 1.0000 8 = 0.0000 9 = 0.0000 10 = 0.0000 11 = 0.0000 12 = 1.0000 13 = 1.0000 14 = 1.0000 15 = 0.0000 16 = 0.0000 17 = 1.0000 18 = 1.0000 19 = 1.0000 20 = 0.0000 Value = -8.0000
elapsed = 0.00 Round = 9 1 = 0.0000 2 = 0.0000 3 = 0.0000 4 = 0.0000 5 = 0.0000 6 = 1.0000 7 = 0.0000 8 = 0.0000 9 = 0.0000 10 = 1.0000 11 = 0.0000 12 = 1.0000 13 = 1.0000 14 = 0.0000 15 = 0.0000 16 = 1.0000 17 = 1.0000 18 = 1.0000 19 = 0.0000 20 = 0.0000 Value = -6.0000
elapsed = 0.00 Round = 10 1 = 0.0000 2 = 0.0000 3 = 0.0000 4 = 1.0000 5 = 1.0000 6 = 0.0000 7 = 0.0000 8 = 0.0000 9 = 1.0000 10 = 1.0000 11 = 0.0000 12 = 1.0000 13 = 1.0000 14 = 1.0000 15 = 0.0000 16 = 0.0000 17 = 1.0000 18 = 1.0000 19 = 0.0000 20 = 0.0000 Value = -8.0000
elapsed = 0.00 Round = 11 1 = 0.0000 2 = 1.0000 3 = 1.0000 4 = 1.0000 5 = 0.0000 6 = 0.0000 7 = 1.0000 8 = 0.0000 9 = 0.0000 10 = 0.0000 11 = 0.0000 12 = 0.0000 13 = 1.0000 14 = 1.0000 15 = 1.0000 16 = 0.0000 17 = 1.0000 18 = 1.0000 19 = 1.0000 20 = 0.0000 Value = -9.0000
elapsed = 0.00 Round = 12 1 = 1.0000 2 = 1.0000 3 = 1.0000 4 = 0.0000 5 = 0.0000 6 = 0.0000 7 = 0.0000 8 = 0.0000 9 = 0.0000 10 = 1.0000 11 = 0.0000 12 = 1.0000 13 = 0.0000 14 = 1.0000 15 = 1.0000 16 = 0.0000 17 = 1.0000 18 = 1.0000 19 = 1.0000 20 = 0.0000 Value = -9.0000
elapsed = 0.00 Round = 13 1 = 0.0000 2 = 1.0000 3 = 1.0000 4 = 0.0000 5 = 0.0000 6 = 1.0000 7 = 0.0000 8 = 0.0000 9 = 1.0000 10 = 0.0000 11 = 0.0000 12 = 1.0000 13 = 1.0000 14 = 0.0000 15 = 0.0000 16 = 0.0000 17 = 0.0000 18 = 0.0000 19 = 1.0000 20 = 1.0000 Value = -7.0000
elapsed = 0.00 Round = 14 1 = 0.0000 2 = 0.0000 3 = 1.0000 4 = 0.0000 5 = 1.0000 6 = 0.0000 7 = 1.0000 8 = 1.0000 9 = 1.0000 10 = 1.0000 11 = 0.0000 12 = 1.0000 13 = 0.0000 14 = 1.0000 15 = 0.0000 16 = 0.0000 17 = 1.0000 18 = 0.0000 19 = 1.0000 20 = 1.0000 Value = -10.0000
elapsed = 0.00 Round = 15 1 = 0.0000 2 = 0.0000 3 = 0.0000 4 = 1.0000 5 = 1.0000 6 = 0.0000 7 = 1.0000 8 = 0.0000 9 = 1.0000 10 = 0.0000 11 = 1.0000 12 = 0.0000 13 = 0.0000 14 = 1.0000 15 = 0.0000 16 = 1.0000 17 = 0.0000 18 = 1.0000 19 = 0.0000 20 = 0.0000 Value = -7.0000
elapsed = 0.00 Round = 16 1 = 1.0000 2 = 0.0000 3 = 0.0000 4 = 0.0000 5 = 0.0000 6 = 0.0000 7 = 0.0000 8 = 1.0000 9 = 1.0000 10 = 0.0000 11 = 1.0000 12 = 1.0000 13 = 0.0000 14 = 0.0000 15 = 0.0000 16 = 1.0000 17 = 0.0000 18 = 0.0000 19 = 0.0000 20 = 1.0000 Value = -6.0000
elapsed = 0.00 Round = 17 1 = 0.0000 2 = 1.0000 3 = 1.0000 4 = 1.0000 5 = 0.0000 6 = 1.0000 7 = 1.0000 8 = 0.0000 9 = 0.0000 10 = 1.0000 11 = 1.0000 12 = 1.0000 13 = 0.0000 14 = 0.0000 15 = 0.0000 16 = 1.0000 17 = 1.0000 18 = 0.0000 19 = 1.0000 20 = 0.0000 Value = -10.0000
elapsed = 0.00 Round = 18 1 = 1.0000 2 = 1.0000 3 = 0.0000 4 = 0.0000 5 = 0.0000 6 = 1.0000 7 = 1.0000 8 = 0.0000 9 = 1.0000 10 = 1.0000 11 = 1.0000 12 = 1.0000 13 = 0.0000 14 = 1.0000 15 = 0.0000 16 = 0.0000 17 = 0.0000 18 = 1.0000 19 = 0.0000 20 = 1.0000 Value = -10.0000
elapsed = 0.00 Round = 19 1 = 0.0000 2 = 1.0000 3 = 0.0000 4 = 0.0000 5 = 0.0000 6 = 1.0000 7 = 1.0000 8 = 1.0000 9 = 0.0000 10 = 0.0000 11 = 0.0000 12 = 0.0000 13 = 0.0000 14 = 1.0000 15 = 1.0000 16 = 1.0000 17 = 1.0000 18 = 0.0000 19 = 1.0000 20 = 0.0000 Value = -8.0000
elapsed = 0.00 Round = 20 1 = 1.0000 2 = 0.0000 3 = 0.0000 4 = 0.0000 5 = 0.0000 6 = 1.0000 7 = 1.0000 8 = 0.0000 9 = 0.0000 10 = 0.0000 11 = 1.0000 12 = 1.0000 13 = 0.0000 14 = 0.0000 15 = 0.0000 16 = 1.0000 17 = 0.0000 18 = 0.0000 19 = 0.0000 20 = 0.0000 Value = -5.0000
Best Parameters Found:
Round = 20 1 = 1.0000 2 = 0.0000 3 = 0.0000 4 = 0.0000 5 = 0.0000 6 = 1.0000 7 = 1.0000 8 = 0.0000 9 = 0.0000 10 = 0.0000 11 = 1.0000 12 = 1.0000 13 = 0.0000 14 = 0.0000 15 = 0.0000 16 = 1.0000 17 = 0.0000 18 = 0.0000 19 = 0.0000 20 = 0.0000 Value = -5.0000
[1] "End time: 2018-05-08 13:39:52 Work time: 1:23:46"
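The objective described in the comment above can be sketched as follows. The exact function was not posted, so the variadic signature and the `Pred = 0` placeholder are assumptions; the `list(Score, Pred)` return shape is what rBayesianOptimization expects from FUN.

```r
# Assumed form of the objective: takes the 20 binary inputs and returns 1 - sum.
toy_fun <- function(...) {
  x <- c(...)
  list(Score = 1 - sum(x), Pred = 0)
}

# 20 integer dimensions, each bounded to {0, 1}; names "1".."20" match the log.
bounds <- setNames(rep(list(c(0L, 1L)), 20), as.character(1:20))

# Inputs of Round 20 from the log, which reports Value = -5.0000.
round20 <- c(1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0)
res <- do.call(toy_fun, as.list(setNames(round20, names(bounds))))
print(res$Score)
```

Since each call to this function is effectively free, the 1:23:46 run time reported above comes from the optimizer side rather than from FUN.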
@alex7tula What is the context for your last comment? Was that using your own GauPro backend? I'm using rBayesianOptimization with XGBoost and seeing the same thing as you guys: the majority of the time is spent fitting the GP rather than building trees. Wondering if it is worth swapping out GPfit.
Hi!
I am using this package to optimize 9 xgboost hyper-parameters: eta, max_depth, min_child_weight, gamma, subsample, colsample_bytree, colsample_bylevel, lambda, alpha. Computing each new set of hyper-parameters is too slow. How can I make it faster?
Thank you very much.