Hyperband and CV: can they be combined?

SebGGruber commented 4 years ago

It already works as is if you replace holdout with CV. The question is what the consequences are compared to holdout:

- better performance estimates in each successive halving iteration, so we can be more confident that the survivors actually deserve to survive
- more accurate overall outcome
- much longer runtime, since k-fold CV fits every config k times instead of once

I think it makes a lot of sense: we are less likely to waste computing power/time on "lucky" configs in the later bracket stages, and we get the usual CV benefits on top.
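A small simulation can illustrate the "lucky config" argument. This is only a sketch under strong assumptions: fold-wise errors are simulated as the true error plus independent Gaussian noise (real CV folds are correlated), and `best_survives` as well as all numbers are made up for illustration.

```r
set.seed(123)

n_configs  = 16L
# Hypothetical true errors; config 1 is truly the best one
true_error = seq(0.10, 0.25, length.out = n_configs)
noise_sd   = 0.05   # assumed noise of a single holdout/fold estimate
n_rep      = 2000L  # number of simulated halving steps

# Share of repetitions in which the truly best config survives one
# halving step when each config is scored by the mean over n_folds folds.
best_survives = function(n_folds) {
  mean(replicate(n_rep, {
    noise = matrix(rnorm(n_configs * n_folds, sd = noise_sd), nrow = n_configs)
    est   = true_error + rowMeans(noise)
    keep  = order(est)[seq_len(n_configs %/% 2L)]
    1L %in% keep
  }))
}

p_holdout = best_survives(1L)  # one noisy estimate per config
p_cv3     = best_survives(3L)  # mean over three folds per config
print(c(holdout = p_holdout, cv3 = p_cv3))
```

Under these assumptions `p_cv3` should come out higher than `p_holdout`, because averaging over three folds shrinks the noise in the ranking that decides who survives.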

An example with 3-fold CV:

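# load the development version of mlr3hyperband (run from the package root)
devtools::load_all()
library(mlr3)
library(paradox)
library(mlr3tuning)
library(mlr3learners)
set.seed(123)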

# define the search space; nrounds is tagged as the budget parameter for hyperband
ps = ParamSet$new(list(
  ParamInt$new("nrounds",           lower = 1, upper = 16, tags = "budget"),
  ParamDbl$new("eta",               lower = 0, upper = 1),
  ParamInt$new("num_parallel_tree", lower = 1, upper = 100),
  ParamInt$new("max_depth",         lower = 1, upper = 100),
  ParamFct$new("normalize_type", levels = c("tree", "forest")),
  ParamFct$new("sample_type",    levels = c("uniform", "weighted")),
  ParamFct$new("booster",        levels = c("gbtree", "gblinear", "dart"))
))

# tuning instance with 3-fold CV
inst = TuningInstance$new(
  tsk("iris"),
  lrn("classif.xgboost"),
  rsmp("cv", folds = 3),
  msr("classif.ce"),
  ps,
  term("evals", n_evals = 100000)
)

# hyperband + tuning
tuner = TunerHyperband$new(eta = 2L)
tuner$tune(inst)

print(inst$archive())
print(tuner$info)
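To put a rough number on the runtime cost for the settings above (budget from 1 to 16, eta = 2), the bracket schedule can be tabulated in base R. This is a sketch that follows the schedule from the original Hyperband paper; `hyperband_schedule` is a hypothetical helper, not part of mlr3hyperband, and the package's own rounding may differ.

```r
# Hypothetical helper: one row per (bracket, stage) with the number of
# configs evaluated and the budget each of them receives.
hyperband_schedule = function(min_budget, max_budget, eta) {
  R     = max_budget / min_budget
  s_max = floor(log(R) / log(eta) + 1e-9)  # small offset guards against rounding
  rows  = list()
  for (s in s_max:0) {
    n = ceiling((s_max + 1) / (s + 1) * eta^s)  # initial configs in bracket s
    r = max_budget * eta^(-s)                   # initial budget in bracket s
    for (i in 0:s) {
      rows[[length(rows) + 1L]] = data.frame(
        bracket   = s,
        stage     = i,
        n_configs = floor(n * eta^(-i)),
        budget    = r * eta^i
      )
    }
  }
  do.call(rbind, rows)
}

schedule = hyperband_schedule(min_budget = 1, max_budget = 16, eta = 2)
folds    = 3

evals        = sum(schedule$n_configs)  # evaluated configs over all brackets
fits_holdout = evals                    # holdout: one model fit per evaluation
fits_cv      = folds * evals            # k-fold CV: k model fits per evaluation
print(schedule)
print(c(holdout = fits_holdout, cv = fits_cv))
```

Under these assumptions the schedule contains 72 evaluations, so 72 model fits with holdout and 216 with 3-fold CV. Each CV fit trains on a slightly different share of the data than a holdout fit, so the factor of three is an approximation of the actual runtime ratio.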

berndbischl commented 4 years ago

@SebGruber1996 simply make sure that this also works in a unit test (with a small CV), then we are done here.

SebGGruber commented 4 years ago

Added 2-fold CV to the unit tests.

pfistfl commented 4 years ago

One other question would be whether we can, for example, evaluate only 1 out of 10 CV folds in the beginning and then gradually increase this. I.e., can we somehow treat the number of folds we want to evaluate as a hyperparameter?
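A rough sketch of what this could look like, in base R with simulated fold errors. Nothing here is an existing mlr3hyperband feature: the fold schedule, the halving rule, and the error matrix are all assumptions for illustration, and folds that were already evaluated are assumed to be reusable in later stages.

```r
set.seed(42)

n_configs = 8L
n_folds   = 10L

# Hypothetical per-fold errors: rows are configs, columns are CV folds
true_error = seq(0.10, 0.45, length.out = n_configs)
fold_error = matrix(
  rnorm(n_configs * n_folds, mean = rep(true_error, times = n_folds), sd = 0.05),
  nrow = n_configs, ncol = n_folds
)

fold_schedule = c(1L, 2L, 5L, 10L)  # assumed number of folds used per stage
survivors     = seq_len(n_configs)
folds_done    = 0L
fits          = 0L                  # model fits actually spent

for (k in fold_schedule) {
  # only the folds not evaluated in earlier stages cost new fits
  fits       = fits + length(survivors) * (k - folds_done)
  folds_done = k
  # score each survivor by its mean error over the first k folds
  score     = rowMeans(fold_error[survivors, seq_len(k), drop = FALSE])
  n_keep    = max(1L, length(survivors) %/% 2L)
  survivors = survivors[order(score)][seq_len(n_keep)]
}

print(survivors)
print(c(fits_spent = fits, fits_full_cv = n_configs * n_folds))
```

With this schedule only the final survivor is evaluated on all 10 folds, so far fewer fits are spent than with full 10-fold CV for every config. The price is that early decisions rest on a single fold and are therefore noisy.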

juliambr commented 4 years ago

> One other question would be whether we can, for example, evaluate only 1 out of 10 CV folds in the beginning and then gradually increase this. I.e., can we somehow treat the number of folds we want to evaluate as a hyperparameter?

I opened a new issue for that #56.

juliambr commented 4 years ago

There was a bug in the evaluation of configs with cross-validation (#50). The tests did not cover this case, so they should probably be extended.

mlr-org / mlr3hyperband, issue #5