mlr-org / mlr3tuning

Hyperparameter optimization package of the mlr3 ecosystem
https://mlr3tuning.mlr-org.com/
GNU Lesser General Public License v3.0

AutoTuner gives error with custom resampling #371

Closed m-muecke closed 1 year ago

m-muecke commented 1 year ago

Description

Using the AutoTuner with a custom resampling throws the following error: Error: Resampling 'custom' may not be instantiated

Reproducible example

library(mlr3verse)
library(mlr3extralearners)  # provides the classif.lightgbm learner

task = tsk("penguins")
learner = lrn("classif.lightgbm",
  num_iterations = to_tune(1, 5000),
  max_depth = to_tune(1, 20),
  lambda_l1 = to_tune(1e-3, 1e3, logscale = TRUE)
)
resampling = rsmp("custom")
resampling$instantiate(task,
  train = list(c(1:10, 51:60, 101:110)),
  test = list(c(11:20, 61:70, 111:120))
)
at = auto_tuner(
  method = tnr("random_search"),
  learner = learner,
  resampling = resampling,
  measure = msr("classif.ce"),
  terminator = trm("evals", n_evals = 5)
)

be-marc commented 1 year ago

Sorry, we cannot allow instantiated resamplings in the AutoTuner, because nested resampling does not work when the inner loop uses an instantiated resampling. The "custom" resampling is always instantiated. If you don't do nested resampling, you can use the function tune() instead. It takes almost the same amount of code.

library(mlr3verse)
library(mlr3extralearners)  # provides the classif.lightgbm learner

task = tsk("penguins")
learner = lrn("classif.lightgbm",
  num_iterations = to_tune(1, 5000),
  max_depth = to_tune(1, 20),
  lambda_l1 = to_tune(1e-3, 1e3, logscale = TRUE)
)
resampling = rsmp("custom")
resampling$instantiate(task,
  train = list(c(1:10, 51:60, 101:110)),
  test = list(c(11:20, 61:70, 111:120))
)
instance = tune(
  method = tnr("random_search"),
  task = task,
  learner = learner,
  resampling = resampling,
  measure = msr("classif.ce"),
  terminator = trm("evals", n_evals = 5)
)

learner$param_set$values = instance$result_learner_param_vals
learner$train(task)

sebffischer commented 1 year ago

But it is possible to use the AutoTuner with the custom resampling, right? I think the better solution might be to allow it, but throw an error if the instantiated resampling of the inner loop conflicts with the outer loop.

sebffischer commented 1 year ago

This test only has to be done when the resampling is instantiated so it does not cost anything otherwise.

be-marc commented 1 year ago

throw an error in case the instantiated resampling of the inner loop conflicts with the outer loop

That was not so easy to check. At least that's what we thought when we decided to throw an error. You only have access to both resamplings in resample(), but mlr3 does not know what an AutoTuner is, or that tuning exists at all. The Task passed to AutoTuner$train() is already subsetted to the outer training set. However, it still has the original row ids. Maybe we changed that in the last two years. So a check might be possible with task$row_ids now. We should discuss this in the next call. It could be that the result of nested resampling is biased again.
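
A rough sketch of what such a check could look like, in base R only. The helper missing_inner_ids() is hypothetical and not part of mlr3 or mlr3tuning. It assumes the row ids of the subsetted task and the train/test sets of the instantiated inner resampling are available as plain integer vectors. In mlr3 these would presumably come from task$row_ids, resampling$train_set(i) and resampling$test_set(i), and the check would only need to run when resampling$is_instantiated is TRUE.

```r
# Hypothetical helper, not part of mlr3 or mlr3tuning.
# Returns the row ids used by an instantiated inner resampling that are
# missing from the (already subsetted) task seen inside the outer loop.
missing_inner_ids = function(task_row_ids, train_sets, test_sets) {
  inner_ids = unique(unlist(c(train_sets, test_sets)))
  setdiff(inner_ids, task_row_ids)
}

# Inner splits taken from the custom resampling in this issue
train_sets = list(c(1:10, 51:60, 101:110))
test_sets = list(c(11:20, 61:70, 111:120))

# Case 1: no outer loop, the task holds all 344 penguins rows,
# so every inner row id is available
all_rows_missing = missing_inner_ids(1:344, train_sets, test_sets)

# Case 2: an assumed outer split that held out rows 1:100,
# so the task inside the AutoTuner only has rows 101:344
outer_train_ids = 101:344
missing = missing_inner_ids(outer_train_ids, train_sets, test_sets)

if (length(missing) > 0) {
  message(sprintf(
    "Inner resampling uses %i row ids that are not in the task",
    length(missing)
  ))
}
```

In the second case the inner splits refer to rows that were held out by the outer loop, which is the conflict that would have to raise an error.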

be-marc commented 1 year ago

Solved by #372. Thanks for your comments!