Recently, I ran an experiment using the Rastrigin function to gauge how well Ray Tune performs as the dimensionality of the search space increases. The Rastrigin function has many local minima, but its global minimum is 0, attained at the origin [Fig. 1].
The results show that as the dimensionality of the search space increases, the quality of the solutions deteriorates. This raises the concern of whether Ray can find a good configuration when there are many tunable hyperparameters. Perhaps we need tooling to trim the search space and reduce its dimensionality, or we could hold some hyperparameters constant, especially those we don't expect to benefit from tuning.
Code:
from ray import tune
from ray.tune.schedulers import PopulationBasedTraining
import numpy as np
import matplotlib.pyplot as plt

# Rastrigin function: f(x) = 10*n + sum(x_i^2 - 10*cos(2*pi*x_i)).
def rastrigin(config):
    x = list(config.values())
    n = len(x)
    score = 10 * n + sum(xi**2 - 10 * np.cos(2 * np.pi * xi) for xi in x)
    return {"score": score}
# Plot the 2-D Rastrigin surface.
# Note: its global minimum is at the origin.
x = np.linspace(-5.12, 5.12, 100)
y = np.linspace(-5.12, 5.12, 100)
X, Y = np.meshgrid(x, y)
Z = rastrigin({"a": X, "b": Y})
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.plot_surface(X, Y, Z["score"], cmap='viridis')
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_zlabel('z')
plt.show()
# Run a Ray Tune experiment to find the global minimum of the n-D Rastrigin function.
max_dim = 10
scores = []
for d in range(1, max_dim + 1):
    search_space = {f"var_{i}": tune.quniform(-2, 2, 0.05) for i in range(d)}
    scheduler = PopulationBasedTraining(
        time_attr="training_iteration",
        hyperparam_mutations=search_space,
        metric="score",
        mode="min",  # we are minimizing the Rastrigin score
    )
    tuner = tune.Tuner(
        rastrigin,
        param_space=search_space,
        tune_config=tune.TuneConfig(
            num_samples=50,
            scheduler=scheduler,
        ),
    )
    results = tuner.fit()
    scores.append(results.get_best_result(metric="score", mode="min").metrics["score"])
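On the idea of holding some hyperparameters constant: for the Rastrigin function specifically, pinning a variable at its optimum contributes nothing to the loss, so the effective dimensionality the tuner has to search drops without changing the attainable minimum. A minimal sketch of that arithmetic, independent of Ray (the names `tuned` and `pinned`, and the choice of which variables to pin, are illustrative assumptions):

```python
import numpy as np

def rastrigin(values):
    # Rastrigin: f(x) = 10*n + sum(x_i^2 - 10*cos(2*pi*x_i)); f(0) = 0.
    n = len(values)
    return 10 * n + sum(v**2 - 10 * np.cos(2 * np.pi * v) for v in values)

# A 10-D problem where 7 of the 10 variables are pinned to a constant
# (here 0.0, i.e. values we assume are not worth tuning).
tuned = [0.05, -0.1, 0.2]  # 3 variables left for the tuner to explore
pinned = [0.0] * 7         # 7 variables held constant at the optimum

# Pinned-at-optimum dimensions add zero loss, so the 10-D score
# equals the 3-D score over the tuned variables alone.
print(rastrigin(tuned + pinned))
print(rastrigin(tuned))
```

If I recall correctly, Ray Tune passes plain (non-distribution) values in `param_space` through to the trainable unchanged, so pinning in practice can be as simple as replacing `tune.quniform(...)` with a constant for the chosen keys; treat that as an assumption to verify against the Tune docs.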
The resulting best loss value vs. the dimensionality of the search space.