YAHPO Gym (Yet Another Hyperparameter Optimization Gym) is a collection of interesting problem sets for benchmarking hyperparameter optimization / black-box optimization methods, described in this paper. The underlying software with additional documentation and background can be found here. See the module Documentation for more info.
YAHPO Gym distinguishes between scenarios and instances.
A scenario is a collection of instances that share the same hyperparameter space. In practice, a scenario usually consists of a single algorithm optimized on a variety of datasets (= instances).
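The relationship between scenarios and instances can be sketched as a small data model. This is only an illustration and not the `yahpo_gym` API: the `Scenario` class and its field names are hypothetical, the dimensionality of `rbv2_svm` is taken from the overview table, and the instance names are placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Scenario:
    """Hypothetical container: one algorithm, evaluated on many datasets."""

    name: str
    search_space_dim: int  # the search space is shared by all instances
    instances: List[str] = field(default_factory=list)

    def problems(self) -> List[Tuple[str, str]]:
        """Each (scenario, instance) pair is one benchmark problem."""
        return [(self.name, instance) for instance in self.instances]


# Placeholder instance names; they do not correspond to real datasets.
svm = Scenario(
    name="rbv2_svm",
    search_space_dim=6,
    instances=["dataset_a", "dataset_b", "dataset_c"],
)

for scenario_name, instance in svm.problems():
    print(f"{scenario_name} on {instance}: {svm.search_space_dim}D search space")
```

An optimizer benchmarked on a scenario would then be run once per instance, always over the same search space.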
This repository contains three modules/packages:
- `yahpo_gym` (Python): The core package allowing for inference on the surrogates.
- `yahpo_train` (Python): Module for training surrogate models used in `yahpo_gym`.
- `yahpo_gym_r` (R): An R wrapper for `yahpo_gym`.

We also maintain a list of frequently asked questions.
NEWS: YAHPO Gym was accepted at the 1st International Conference on Automated Machine Learning!
pip install yahpo-gym
YAHPO Gym (Yet Another Hyperparameter Optimization Gym) provides blazing fast and simple access to a variety of interesting benchmark problems for hyperparameter optimization. Since all our benchmarks are based on surrogate models that approximate the underlying HPO problems with very high fidelity, function evaluations are fast and memory-friendly, allowing for fast benchmarks across a large variety of problems.
Overview of benchmark instances
Scenario | Search Space | # Instances | Target Metrics | Fidelity | H | Source |
---|---|---|---|---|---|---|
rbv2_super | 38D: Mixed | 103 | 9: perf(6) + rt(2) + mem | fraction | ✓ | [1] |
rbv2_svm | 6D: Mixed | 106 | 9: perf(6) + rt(2) + mem | fraction | ✓ | [1] |
rbv2_rpart | 5D: Mixed | 117 | 9: perf(6) + rt(2) + mem | fraction | | [1] |
rbv2_aknn | 6D: Mixed | 118 | 9: perf(6) + rt(2) + mem | fraction | | [1] |
rbv2_glmnet | 3D: Mixed | 115 | 9: perf(6) + rt(2) + mem | fraction | | [1] |
rbv2_ranger | 8D: Mixed | 119 | 9: perf(6) + rt(2) + mem | fraction | ✓ | [1] |
rbv2_xgboost | 14D: Mixed | 119 | 9: perf(6) + rt(2) + mem | fraction | ✓ | [1] |
nb301 | 34D: Categorical | 1 | 2: perf(1) + rt(1) | epoch | ✓ | [2], [3] |
lcbench | 7D: Numeric | 34 | 6: perf(5) + rt(1) | epoch | | [4], [5] |
iaml_super | 28D: Mixed | 4 | 12: perf(4) + inp(3) + rt(2) + mem(3) | fraction | ✓ | [6] |
iaml_rpart | 4D: Numeric | 4 | 12: perf(4) + inp(3) + rt(2) + mem(3) | fraction | | [6] |
iaml_glmnet | 2D: Numeric | 4 | 12: perf(4) + inp(3) + rt(2) + mem(3) | fraction | | [6] |
iaml_ranger | 8D: Mixed | 4 | 12: perf(4) + inp(3) + rt(2) + mem(3) | fraction | ✓ | [6] |
iaml_xgboost | 13D: Mixed | 4 | 12: perf(4) + inp(3) + rt(2) + mem(3) | fraction | ✓ | [6] |
The full, up-to-date overview can be obtained from the Documentation.
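The Target Metrics column appears to use the notation `total: group(count) + ...`, where a group written without a count (such as `mem` in the `rbv2_*` rows) contributes a single metric. This reading is an assumption inferred from the table, not something the documentation states here. The helper `parse_targets` below is a hypothetical sketch that parses the notation and checks that the group counts add up to the stated total.

```python
import re
from typing import Dict, Tuple


def parse_targets(spec: str) -> Tuple[int, Dict[str, int]]:
    """Parse e.g. '9: perf(6) + rt(2) + mem' into (9, {'perf': 6, ...})."""
    total_str, groups_str = spec.split(":", 1)
    counts: Dict[str, int] = {}
    for term in groups_str.split("+"):
        match = re.fullmatch(r"\s*(\w+)(?:\((\d+)\))?\s*", term)
        if match is None:
            raise ValueError(f"cannot parse term {term!r}")
        # Assumption: a group without an explicit count stands for one metric.
        counts[match.group(1)] = int(match.group(2) or 1)
    return int(total_str), counts


# The four distinct entries that occur in the overview table.
specs = [
    "9: perf(6) + rt(2) + mem",
    "2: perf(1) + rt(1)",
    "6: perf(5) + rt(1)",
    "12: perf(4) + inp(3) + rt(2) + mem(3)",
]

for spec in specs:
    total, counts = parse_targets(spec)
    print(spec, "->", counts, "consistent:", total == sum(counts.values()))
```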
The fidelity is given either as the dataset fraction `fraction` or the number of epochs `epoch`.
Search spaces can be numeric, categorical, or mixed, and can have dependencies between hyperparameters (as indicated in the `H` column).
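To make the notions of a dependency and a fidelity parameter concrete, here is a minimal sketch of sampling from a toy mixed search space. Everything in it is an assumption for illustration: the hyperparameter names (`kernel`, `cost`, `degree`) and their ranges are invented and do not reproduce the search space of any YAHPO Gym scenario, and `sample_configuration` is a hypothetical helper, not part of `yahpo_gym`.

```python
import random


def sample_configuration(rng: random.Random, fraction: float) -> dict:
    """Draw one configuration from a toy search space with a dependency."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    config = {
        # Categorical hyperparameter.
        "kernel": rng.choice(["linear", "polynomial", "radial"]),
        # Numeric hyperparameter, sampled uniformly on a log10 scale.
        "cost": 10 ** rng.uniform(-3, 3),
    }
    # Dependency: `degree` is only active when the kernel is polynomial.
    if config["kernel"] == "polynomial":
        config["degree"] = rng.randint(2, 5)
    # The fidelity is chosen by the caller rather than sampled.
    config["fraction"] = fraction
    return config


rng = random.Random(0)
for fraction in (0.1, 0.5, 1.0):
    print(sample_configuration(rng, fraction))
```

A multi-fidelity optimizer would typically evaluate many configurations at a low `fraction` and only promote the promising ones to `fraction = 1.0`.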
Original data sources are given by:
Please make sure to always also cite the original data sources as YAHPO Gym would not have been possible without them!
This repository contains two Python modules: `yahpo_gym` and `yahpo_train`.
While we mainly focus on `yahpo_gym`, as it provides an interface to the benchmark described in our paper, we also provide the full reproducible codebase used to generate the underlying surrogate neural networks in `yahpo_train`.
YAHPO Gym is the module for inference and allows for evaluating a hyperparameter configuration (HPC) on a given benchmark instance.
Surrogate models (ONNX files), configspaces and metadata (encoding) can be obtained here (GitHub).
An example for evaluation and running HPO methods is given in the README of the YAHPO Gym module.
A quick introduction is given in the accompanying Jupyter notebook.
YAHPO Train is the module for training new surrogate models.
YAHPO Train is still in a preliminary state but can already be used to reproduce and refit models introduced in our paper.
A Docker image that allows accessing `yahpo_gym` is available from DockerHub at pfistfl/yahpo. This adds additional overhead but simplifies use and installation.
The corresponding Dockerfile to get you started can be found in `docker/`.
We want to add several features to `yahpo_gym` in future versions:
- `objective_function_timed`, but requires additional (experimental) evaluation for release.
- `noisy = True` during instantiation, but this feature is considered experimental and requires additional evaluation for release.
- Additional Scenarios: We are always happy to include additional (interesting) scenarios. If you know of (or want to add) an additional scenario, get in touch!
We welcome input, discussion or additions by the broader community. Get in touch via issues or emails if you have questions, comments or would like to collaborate!
- `rbv2_*` in a real setting.
- `iaml_*` in a real setting.

If you use YAHPO Gym, please cite the following paper:
Moreover, certain scenarios build upon previous work; e.g., the `lcbench` scenario uses data from:
Original data sources of a scenario that should also be cited are provided via the `"citation"` key within the `config` dictionary of a scenario, e.g.:
from yahpo_gym.configuration import cfg
lcbench = cfg("lcbench")
lcbench.config.get("citation")
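When several scenarios are used in one study, the same lookup can be repeated and the results de-duplicated. The sketch below is a hypothetical helper, not part of `yahpo_gym`. It assumes only what the snippet above shows, namely that a loaded scenario exposes a `config` mapping with an optional `"citation"` key. Whether that value is a single string or a list is an assumption, so both are handled. The demo uses a stub loader with placeholder scenario names and references so that it runs without the package installed.

```python
from types import SimpleNamespace


def collect_citations(scenario_names, load_config):
    """Gather the citations of several scenarios, without duplicates.

    `load_config` is any callable that maps a scenario name to an object
    with a dict-like `.config` attribute (hypothetical interface).
    """
    citations = []
    for name in scenario_names:
        value = load_config(name).config.get("citation")
        if value is None:
            continue
        # Assumption: the value is either one string or a list of strings.
        entries = [value] if isinstance(value, str) else list(value)
        for entry in entries:
            if entry not in citations:
                citations.append(entry)
    return citations


# Stub data with placeholder references; not real YAHPO Gym metadata.
_STUB_CONFIGS = {
    "scenario_a": {"citation": ["Placeholder reference 1", "Placeholder reference 2"]},
    "scenario_b": {"citation": "Placeholder reference 2"},
    "scenario_c": {},
}


def stub_loader(name):
    return SimpleNamespace(config=_STUB_CONFIGS[name])


to_cite = collect_citations(["scenario_a", "scenario_b", "scenario_c"], stub_loader)
print(to_cite)
```

With `yahpo_gym` installed, passing `cfg` (imported as in the snippet above) in place of `stub_loader` should work the same way, provided the object it returns exposes `.config` as shown there.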