Frugal Optimization for Cost-related Hyperparameters

In principle, line search with some modifications.
The main contribution is the theoretical analysis based on convex optimization.
Main points
Add the following three modifications to line search (see the sketch after this list):
Restart after getting stuck in a local optimum
Adaptive step size based on the theoretical analysis
Rounding of configurations for discrete spaces
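A minimal sketch of how these three pieces could fit together, assuming a simple randomized local/line search; the function name, bounds handling, and the doubling/halving step-size rule below are my own illustrative choices, not the paper's exact algorithm:

```python
import numpy as np

def frugal_local_search(f, lb, ub, is_int, n_evals=200, seed=0):
    """Sketch of randomized line/local search with the three modifications above:
    restart, adaptive step size, and rounding. The update rule is illustrative,
    not the exact algorithm from the paper."""
    rng = np.random.default_rng(seed)
    dim = len(lb)

    def project(x):
        # Clip to the box and round integer dimensions (modification: rounding).
        x = np.clip(x, lb, ub)
        return np.where(is_int, np.round(x), x)

    step_init = 0.1 * (ub - lb)
    step_min = 1e-3 * (ub - lb)

    x = project(rng.uniform(lb, ub))
    y = f(x)
    best_x, best_y = x, y
    step = step_init.copy()

    for _ in range(n_evals):
        # Propose a move along a random unit direction (randomized line search).
        direction = rng.standard_normal(dim)
        direction /= np.linalg.norm(direction)
        cand = project(x + step * direction)
        y_cand = f(cand)
        if y_cand < best_y:
            best_x, best_y = cand, y_cand
        if y_cand < y:
            x, y = cand, y_cand
            step = step * 2.0  # success: expand the step (modification: adaptive step size)
        else:
            step = step * 0.5  # failure: shrink the step
        if np.all(step < step_min):
            # Step size shrank below the threshold, i.e. stuck in a local optimum:
            # restart from a fresh random point (modification: restart).
            x = project(rng.uniform(lb, ub))
            y = f(x)
            if y < best_y:
                best_x, best_y = x, y
            step = step_init.copy()
    return best_x, best_y


# Toy usage: minimize a quadratic with one integer dimension.
if __name__ == "__main__":
    f = lambda x: (x[0] - 3.2) ** 2 + (x[1] - 7) ** 2
    lb, ub = np.array([0.0, 0.0]), np.array([10.0, 10.0])
    print(frugal_local_search(f, lb, ub, is_int=np.array([False, True])))
```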
The problem setting does not have the concept of fidelity, but the paper assumes that the runtime of evaluating $f(x)$ changes based on the choice of $x$.
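For instance (a hypothetical setup of my own, not one of the paper's benchmarks), a cost-related hyperparameter such as the number of boosting rounds makes both the objective $f(x)$ and its evaluation cost $g(x)$ depend on $x$:

```python
import time
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=2000, random_state=0)

def evaluate(n_estimators: int):
    """Return (f(x), g(x)): the CV error and the wall-clock cost,
    both of which depend on the configuration x = n_estimators."""
    start = time.time()
    score = cross_val_score(
        GradientBoostingClassifier(n_estimators=n_estimators, random_state=0),
        X, y, cv=3,
    ).mean()
    return 1.0 - score, time.time() - start
```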
For the cost-aware analysis, it is important to have Condition 3.
Condition 3 states that if the costs of $x_1$ and $x_2$ satisfy $g(x^\star) < g(x_1) < g(x_2)$, then $f(x^\star) < f(x_1) < f(x_2)$ also holds.
Note that $x^\star$ is a local optimum.
Intuitively speaking, if the model already achieves good performance at cost $g(x^\star)$, longer training does not improve it; with more cost we would rather end up overfitting.
Although this does not always hold in practice, they cannot provide the theoretical analysis without this condition.
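Written out as a display (a paraphrase of how Condition 3 is summarized above; the paper's exact statement may differ):

```latex
% Condition 3 (paraphrased): beyond the cost of the local optimum x*,
% a larger cost implies a larger (worse) objective value.
\[
  g(x^\star) < g(x_1) < g(x_2)
  \;\Longrightarrow\;
  f(x^\star) < f(x_1) < f(x_2)
\]
```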
Experiments
They used the following methods:
Random search
BOHB
GPEI (GP with expected improvement)
GPEIPS (GP with expected improvement per second)
SMAC
Their methods
Their benchmarks are self-made.
The visualizations are:
Performance over time
Dataset-wise final performance distribution of each method
Dataset-wise time-to-reach-best distribution of each method
==> I am not convinced by the distribution plots.