yanyachen / rBayesianOptimization

Bayesian Optimization of Hyperparameters

use initial grid as input #1

Closed topepo closed 8 years ago

topepo commented 8 years ago

Excellent package. I was on my way to writing this but I'm happy that you got there first =]

As an alternative to specifying the initial number of points, it would be great to be able to pass in a data frame/table with the initial points. I use random search and would like to see where the Bayesian optimization would go with it. Those previous settings would be a great way to seed your algorithm. If we could also pass in a Value column, that would be even better so that we wouldn't have to recompute the models.

Thanks,

Max

yanyachen commented 8 years ago

Thank you very much. I was thinking about an initial grid too, but I had two major concerns about this:

  1. It will require the user to supply enough sample points for the GP to build on (in my experience, about 10). For now, my idea is to let the user choose init_points+bayes_opt or init_grid+bayes_opt. init_points+init_points+bayes_opt feels a little weird.
  2. I'm not sure whether I should let the user also input the "Pred" for the initial grid points. If so, then the user must use the same CV folds all the time; otherwise they may have a leakage problem when using their model for model ensembling / stacking.

I would love to hear your comments and suggestions on this. Thank you,

Yachen

topepo commented 8 years ago

It's all possible, but it depends on how much you want to program around different types of input. If you let the user input a grid, you could use the logic that, if the grid has a Value column, you use it; otherwise, loop through the rows and generate it (along with whatever Pred returns from their function).

I'm not sure what you mean by

init_points+init_points+bayes_opt

I have not been using the Pred slot since I just want to get the final parameter values. If they give a Value column but don't pass in Pred (or pass zero like in the documentation), you can just make those entries NULL or NA in the output of your function.
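A minimal sketch of the Value-column logic described above (this is a hypothetical helper, not the package's actual internals; `seed_grid` and `score_fun` are names invented here for illustration):

```r
# Hypothetical sketch of the suggested logic: if the user-supplied grid
# already has a Value column, reuse it; otherwise evaluate the scoring
# function on each row of parameters to fill it in. score_fun is assumed
# to return list(Score = ..., Pred = ...), matching the package's
# FUN convention.
seed_grid <- function(grid, score_fun) {
  if (!"Value" %in% names(grid)) {
    param_cols <- names(grid)
    grid$Value <- vapply(
      seq_len(nrow(grid)),
      function(i) do.call(score_fun, as.list(grid[i, param_cols, drop = FALSE]))$Score,
      numeric(1)
    )
  }
  grid
}
```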

yanyachen commented 8 years ago

Sorry for the delayed reply; I was busy with work for the last two days. I have added an init_grid argument as you suggested and pushed it to GitHub. Please test it if you have time. I will update the CRAN version after testing and add more examples for init_grid.

topepo commented 8 years ago

No problem. I tested it with the code below. Everything was fine, although it does expect at least one additional value to be generated from init_points. There doesn't seem to be a way to use only the points that are seeded into the system. If that's intended, you might add an informative error message. Right now, it errors out with

Error in setnames(., old = names(.), new = DT_bounds[, Parameter]) : 'old' is length 1 but 'new' is length 2

Thanks

library(caret)
library(rBayesianOptimization)
library(data.table)

set.seed(1)
train_data <- twoClassSim(300)

set.seed(8796)
initial_mod <- train(Class ~ ., data = train_data,
                     method = "svmRadial",
                     preProc = c("center", "scale"),
                     tuneLength = 20,
                     metric = "ROC",
                     trControl = trainControl(method = "cv", search = "random",
                                              classProbs = TRUE, 
                                              summaryFunction = twoClassSummary))

initial_grid <- initial_mod$results[, c("C", "sigma", "ROC")]
initial_grid$C <- log(initial_grid$C)
initial_grid$sigma <- log(initial_grid$sigma)
names(initial_grid) <- c("logC", "logSigma", "Value")
initial_grid <- as.data.table(initial_grid)

svm_fit_bayes <- function(logC, logSigma) {
  txt <- capture.output(
    mod <- train(Class ~ ., data = train_data,
                 method = "svmRadial",
                 preProc = c("center", "scale"),
                 metric = "ROC",
                 trControl = trainControl(method = "cv", search = "random",
                                          classProbs = TRUE, 
                                          summaryFunction = twoClassSummary),
                 tuneGrid = data.frame(C = exp(logC), sigma = exp(logSigma)))
  )
  list(Score = getTrainPerf(mod)[, "TrainROC"], Pred = 0)
}

lower_bounds <- c(logC = -5, logSigma = -10)
upper_bounds <- c(logC = 15, logSigma = 5)
bounds <- list(logC = c(lower_bounds[1], upper_bounds[1]),
               logSigma = c(lower_bounds[2], upper_bounds[2]))

set.seed(914)
orig <- BayesianOptimization(svm_fit_bayes,
                             bounds = bounds,
                             init_points = 20, 
                             n_iter = 10,
                             acq = "ucb", 
                             kappa = 1, 
                             eps = 0.0,
                             verbose = TRUE)

set.seed(914)
add_more <- BayesianOptimization(svm_fit_bayes,
                               bounds = bounds,
                               init_grid_dt = initial_grid, 
                               init_points = 5,
                               n_iter = 10,
                               acq = "ucb", 
                               kappa = 1, 
                               eps = 0.0,
                               verbose = TRUE)

set.seed(914)
seeded <- BayesianOptimization(svm_fit_bayes,
                               bounds = bounds,
                               init_grid_dt = initial_grid, 
                               init_points = 0, ## doesn't work
                               n_iter = 10,
                               acq = "ucb", 
                               kappa = 1, 
                               eps = 0.0,
                               verbose = TRUE)
yanyachen commented 8 years ago

I have fixed the bug that occurred when init_points = 0 or n_iter = 0. I used your example code below to check. The updated package has already been pushed to GitHub.

set.seed(914)
no_init_points <- BayesianOptimization(svm_fit_bayes,
                                       bounds = bounds,
                                       init_grid_dt = initial_grid, 
                                       init_points = 0,
                                       n_iter = 10,
                                       acq = "ucb", 
                                       kappa = 1, 
                                       eps = 0.0,
                                       verbose = TRUE)

set.seed(914)
no_n_iter <- BayesianOptimization(svm_fit_bayes,
                                  bounds = bounds,
                                  init_grid_dt = initial_grid, 
                                  init_points = 5,
                                  n_iter = 0,
                                  acq = "ucb", 
                                  kappa = 1, 
                                  eps = 0.0,
                                  verbose = TRUE)
topepo commented 8 years ago

I tried it on another data set for regression and everything worked perfectly. Thanks for making these changes.

mertyagli commented 5 years ago

Sorry for the 2-year-delayed update, but I have a question related to this change. In the first post, @topepo stated "it would be great to be able to pass in a data frame/table with the initial points." Is there any advantage to doing this? What happens if you don't pass initial points and instead use init_grid_dt = NULL, init_points = 20?

Thanks.