MLBazaar / BTB

A simple, extensible library for developing AutoML systems
https://mlbazaar.github.io/BTB/
MIT License

Integrate SMAC with benchmarking #195

Closed: pvk-developer closed this 4 years ago

pvk-developer commented 4 years ago

Integrate SMAC (Sequential Model-based Algorithm Configuration) with our benchmarking.

How to integrate SMAC3 tuners:

All SMAC3 tuners share the same API: each tuner is a class that is used by calling a single `optimize` method, which returns the best hyperparameter configuration found. The user must first create a `Scenario` object with arguments describing the optimization run (shown in the sketch below).

Once this `Scenario` is created, we can create the tuner instance, which receives the `Scenario` and the target function (`tae_runner`) as arguments.

With our tuner instance configured, we can call the `optimize` method to obtain the best hyperparameter configuration.
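A minimal sketch of this flow using the SMAC4HPO facade (the single toy hyperparameter and objective are placeholders, not part of our benchmark):

```python
from ConfigSpace.hyperparameters import UniformFloatHyperparameter
from smac.configspace import ConfigurationSpace
from smac.facade.smac_hpo_facade import SMAC4HPO
from smac.scenario.scenario import Scenario

# Search space: a single continuous hyperparameter (placeholder).
cs = ConfigurationSpace()
cs.add_hyperparameter(UniformFloatHyperparameter('x', -5.0, 5.0))

# Scenario: describes the optimization run.
scenario = Scenario({
    'run_obj': 'quality',      # optimize solution quality, not runtime
    'runcount-limit': 50,      # maximum number of function evaluations
    'cs': cs,                  # the configuration space
    'deterministic': 'true',
})

def evaluate(cfg):
    # SMAC minimizes the returned value.
    return (cfg['x'] - 2) ** 2

# The tuner receives the Scenario and the target function.
smac = SMAC4HPO(scenario=scenario, tae_runner=evaluate)
best_config = smac.optimize()
```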

Because the tuner API is different from what our benchmark expects, we will need to create a wrapper function that adapts one to the other, as sketched below.
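A minimal sketch of such a wrapper, assuming the benchmark invokes tuning functions as `tuning_function(scoring_function, tunable_hyperparameters, iterations)` and maximizes the returned score; the `_to_configspace` helper and the `'type'`/`'range'`/`'values'` spec format are hypothetical:

```python
from ConfigSpace.hyperparameters import (
    CategoricalHyperparameter,
    UniformFloatHyperparameter,
    UniformIntegerHyperparameter,
)
from smac.configspace import ConfigurationSpace
from smac.facade.smac_hpo_facade import SMAC4HPO
from smac.scenario.scenario import Scenario


def _to_configspace(tunable_hyperparameters):
    """Translate the benchmark's hyperparameter spec into a ConfigurationSpace.

    The 'type' / 'range' / 'values' keys are an assumption about how the
    benchmark describes hyperparameters.
    """
    cs = ConfigurationSpace()
    for name, spec in tunable_hyperparameters.items():
        if spec['type'] == 'int':
            cs.add_hyperparameter(UniformIntegerHyperparameter(name, *spec['range']))
        elif spec['type'] == 'float':
            cs.add_hyperparameter(UniformFloatHyperparameter(name, *spec['range']))
        else:
            cs.add_hyperparameter(CategoricalHyperparameter(name, spec['values']))
    return cs


def smac4hpo_tuning_function(scoring_function, tunable_hyperparameters, iterations):
    """Adapt SMAC4HPO to the tuning-function signature our benchmark expects."""
    scenario = Scenario({
        'run_obj': 'quality',
        'runcount-limit': iterations,
        'cs': _to_configspace(tunable_hyperparameters),
        'deterministic': 'true',
    })

    def tae_runner(config):
        # SMAC minimizes, while the benchmark maximizes scores, so negate.
        return -scoring_function(**config.get_dictionary())

    smac = SMAC4HPO(scenario=scenario, tae_runner=tae_runner)
    best_config = smac.optimize()

    # Report the score of the best configuration found.
    return scoring_function(**best_config.get_dictionary())
```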

SMAC3 tuners:

SMAC4HPO: Bayesian optimization using a Random Forest model from pyrfr. Its implementation is explained in the following paper. We want to implement this tuner: its main functionality is finding the optimal configuration for a machine learning algorithm, and we want to use it for all the kinds of challenges we have in our library.

HB4AC (Hyperband): Uses Successive Halving to make proposals. Its implementation is explained in the following paper. We are interested in this hyperparameter optimization algorithm; a minimal sketch of its facade appears after this list.

ROAR (Random Online Aggressive Racing): Selects and tests uniformly random parameter settings. Its implementation is explained in the paper. We don't want to implement it, because this algorithm is equivalent to our UniformTuner.

SMAC4AC (Algorithm Configuration): Selects parameter settings using a Random Forest from the pyrfr library. Its implementation is explained in the following paper. We don't want to implement it, because it is designed to optimize the execution time of different algorithms, while we are more interested in machine learning model hyperparameter optimization.

SMAC4BO: Bayesian optimization using a Gaussian Process model from the skopt library. Its implementation is explained in the following paper. We are not interested in its integration, as its implementation is aimed at tuning small values from 0 to 1.0.

BOHB4HPO (Bayesian Optimization (BO) and Hyperband (HB)): Starts with successive halving until enough configurations have been evaluated to train a model (a Gaussian Process), then switches to using that model for proposals. The proposal for this model can be found in the following paper. We are not interested in its integration, as it has the same deficiency as SMAC4BO.
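As referenced above, a minimal sketch of how the HB4AC (Hyperband) facade could be driven; the toy objective, the budget bounds, and the `intensifier_kwargs` values are illustrative assumptions:

```python
from ConfigSpace.hyperparameters import UniformFloatHyperparameter
from smac.configspace import ConfigurationSpace
from smac.facade.hyperband_facade import HB4AC
from smac.scenario.scenario import Scenario

cs = ConfigurationSpace()
cs.add_hyperparameter(UniformFloatHyperparameter('x', -5.0, 5.0))

scenario = Scenario({
    'run_obj': 'quality',
    'runcount-limit': 50,
    'cs': cs,
    'deterministic': 'true',
})

def evaluate(cfg, seed=0, budget=0):
    # `budget` is the fidelity assigned by Successive Halving (e.g. number
    # of training epochs); this toy objective simply ignores it.
    return (cfg['x'] - 2) ** 2

smac = HB4AC(
    scenario=scenario,
    tae_runner=evaluate,
    intensifier_kwargs={'initial_budget': 1, 'max_budget': 9, 'eta': 3},
)
best_config = smac.optimize()
```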