mlr-org / mlr3tuning

Hyperparameter optimization package of the mlr3 ecosystem
https://mlr3tuning.mlr-org.com/
GNU Lesser General Public License v3.0

retrain #277

Closed be-marc closed 3 years ago

be-marc commented 3 years ago

Overview

This PR allows continuable models to be used during tuning. The tuner can propose models that should be continued in the objective function instead of fitting a new model from scratch.

Contract

The Tuner is solely responsible for choosing which models should be continued. If the Tuner wants to continue a model, it passes the uhash of the original model and the old hyperparameter configuration with an increased budget to ObjectiveTuning.

The ObjectiveTuning can perform some extra checks before continuing a model, e.g. whether the learner supports $continue(). These checks are always the same and therefore shouldn't be reimplemented in each tuner.
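A minimal sketch of such a check, assuming the learner exposes a $continue() method and the stored models can be looked up via the archive's BenchmarkResult; the helper name and fields are assumptions, not the final implementation:

assert_continuable = function(learner, archive, continue_hash) {
  # the learner must expose a $continue() method
  if (!is.function(learner$continue)) {
    stop("Learner does not support $continue()")
  }
  # the uhash passed by the Tuner must refer to a model stored in the archive
  if (!continue_hash %in% archive$benchmark_result$uhashes) {
    stop(sprintf("No model with uhash '%s' in the archive", continue_hash))
  }
  invisible(TRUE)
}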

Design issue

We only allow hyperparameter configurations to be passed from the tuner to the objective function (Tuner$.optimize() -> TuningInstance*$eval_batch(xdt) -> ObjectiveTuning$.eval_many(xss)). Extra information can be passed in xdt, but it is only stored in the archive and is not available in $.eval_many(xss). xss is a list of lists of hyperparameter configurations. This design is specified by bbotk rather than mlr3tuning.
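To make the distinction concrete, here is a small sketch of the two representations (the hyperparameter names are made up):

# what the Tuner hands to TuningInstance*$eval_batch(): a data.table with
# one row per hyperparameter configuration
xdt = data.table::data.table(cp = c(0.1, 0.01), minsplit = c(5, 10))

# what ObjectiveTuning$.eval_many() receives: a list of lists, one named
# list of hyperparameter values per configuration
xss = list(
  list(cp = 0.1, minsplit = 5),
  list(cp = 0.01, minsplit = 10)
)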

Currently, two issues are caused by this design:

1) TunerIrace wants to decide the resampling on which the hyperparameter configurations are evaluated. Currently, the resampling is more or less fixed.
2) TunerHyperband should be able to continue models. For this, the hyperparameter configurations in xss (with an increased budget) must be connected to already evaluated models in the archive.

Option 1 - Active Bindings

The Tuner can alter the objective function via active bindings. Before calling $eval_batch() the new resampling is set and a continue hash can be passed. See this PR for an implementation.

# uhash of the model to continue in the next batch
instance$objective$continue_hash = uhash
# resampling on which the next configurations are evaluated
instance$objective$resampling = resampling
instance$eval_batch(xdt)

I used this method in #227 and this PR since it requires no change in bbotk. I like this solution because we can move some logic to the active bindings instead of writing an increasingly long $.eval_many() method, e.g. checking whether the new resampling is already instantiated. However, I admit that it feels more natural to pass the hyperparameter configurations along with the corresponding resampling and continue hash in one call (Option 2).

Check out https://github.com/mlr-org/mlr3hyperband/blob/ae5ddb785cceb882fab3e64cea5a872fa0decdf4/R/TunerSuccessiveHalving.R to see what the Tuner is doing (Line 93).
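For reference, a condensed sketch of what such active bindings could look like; class and field names are illustrative and do not mirror this PR's implementation exactly:

library(R6)

ObjectiveTuningContinue = R6Class("ObjectiveTuningContinue",
  # inherit = ObjectiveTuning in the real package
  public = list(
    task = NULL,
    initialize = function(task) {
      self$task = task
    }
  ),
  active = list(
    # validation logic lives in the binding instead of $.eval_many()
    resampling = function(rhs) {
      if (missing(rhs)) return(private$.resampling)
      if (!rhs$is_instantiated) rhs$instantiate(self$task)
      private$.resampling = rhs
    },
    # uhash of the model that should be continued in the next batch
    continue_hash = function(rhs) {
      if (missing(rhs)) return(private$.continue_hash)
      private$.continue_hash = rhs
    }
  ),
  private = list(
    .resampling = NULL,
    .continue_hash = NULL
  )
)

With this, the instantiation check runs when the Tuner assigns the binding (as in the snippet above), which keeps $.eval_many() short.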

Option 2 - Extras in $eval_batch call

a) The Tuner can pass the resampling and continue hash along with the hyperparameter configurations. We would need to overload ObjectiveTuning$eval_many() and TuningInstance*$eval_batch() to allow this extra information to be passed to the objective function. I don't like this solution at all since we would have to reimplement abstract methods from bbotk that work just fine.

instance$eval_batch(xdt, resampling, continue_hash)

b) The Tuner can pass extra information through a list to the objective function. We would need to change bbotk for this. This might be the most flexible solution.

instance$eval_batch(xdt, extra = list(resampling, continue_hash))

Option 3 - Config Space

We use specialized classes like TuningInstanceIrace / ObjectiveTuningIrace and TuningInstanceHyperband / ObjectiveTuningHyperband. In addition to the hyperparameter configurations, xss stores configuration parameters (resamplings and continue_hashes). We split these in $.eval_many() by defining a configuration parameter set in the ObjectiveTuning* classes.
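A rough sketch of how the split in $.eval_many() could work, assuming the ObjectiveTuning* class holds a paradox parameter set for the configuration parameters (names and parameter types are illustrative):

# configuration space held by the ObjectiveTuning* class
config_space = paradox::ps(
  resampling = paradox::p_uty(),
  continue_hash = paradox::p_uty()
)

# split one element of xss into hyperparameters and configuration parameters
split_configuration = function(xs, config_space) {
  config_ids = config_space$ids()
  list(
    hyperparameters = xs[setdiff(names(xs), config_ids)],
    configuration = xs[intersect(names(xs), config_ids)]
  )
}

split_configuration(
  list(cp = 0.01, budget = 8, continue_hash = "uhash-123"),
  config_space
)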

be-marc commented 3 years ago

new approach in #297