Open fabio-ciani opened 2 weeks ago
As a follow-up note, it is advisable to limit concurrency after the initialization phase of Tuner. Overriding this option can be crucial, in particular when considering inherently sequential methods like Bayesian optimization. (See #13962.)
import math
import time

from ray import tune
from ray.tune.search import ConcurrencyLimiter
from ray.tune.search.bayesopt import BayesOptSearch


def objective(config):
    time.sleep(10.0)
    x = config["x"]
    return {"score": x ** 3 * math.log(x) - x ** 2}


# Exploration.
search_space = {"x": tune.uniform(0.0, 2.0)}
algo = BayesOptSearch(random_search_steps=10, patience=1000)
tuner = tune.Tuner(
    objective,
    tune_config=tune.TuneConfig(num_samples=10, metric="score", mode="min", search_alg=algo),
    param_space=search_space
)
tuner.fit()

# Exploitation.
tuner = tune.Tuner(
    objective,
    tune_config=tune.TuneConfig(num_samples=10, metric="score", mode="min", search_alg=ConcurrencyLimiter(algo, max_concurrent=1)),
)
results = tuner.fit()
print(results.get_best_result().config)
This represents an alternative solution that feels more natural and tailored to the use-case scenario.
Hey @hongpeng-guo @justinvyu just following up on this one, any insight here? Thx
Description
Bug
If the score computation associated with a hyperparameter tuning trial takes too long, the patience of BayesOptSearch is exhausted because of duplicated configurations. This is likely due to other tasks being launched simultaneously to mask and solve the slowness of the original task, trying to complete its uncommitted work.
For example, see the attached code snippet. The procedure should sample 10 random points to initialize the search, and then do another 10 steps. However, at the 11th iteration, the call stops, meaning that 9 points are missing and will never be sampled.
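To make the suspected mechanism concrete, here is a toy model in plain Python. It is not Ray's implementation: the function simulate_search and its parameters patience, polls_per_trial and total_trials are hypothetical names made up for illustration. It assumes that the optimizer proposes the same point again until a result is registered, and that the search is declared finished once any single proposal has been seen more than patience times.

```python
from collections import Counter


def simulate_search(patience, polls_per_trial, total_trials):
    """Toy model of duplicate suggestions exhausting a patience budget.

    Assumption (not taken from Ray's code): the optimizer is deterministic,
    so while a trial is still running every new request yields the same
    proposal. `polls_per_trial` is how many times a suggestion is requested
    before the running trial reports its score.
    Returns the number of trials that completed before the search stopped.
    """
    seen = Counter()
    completed = 0
    while completed < total_trials:
        # Stand-in for the proposal; it only changes after a new result.
        proposal = completed
        for _ in range(polls_per_trial):
            seen[proposal] += 1
            if seen[proposal] > patience:
                # Too many repeats of one configuration: stop early.
                return completed
        completed += 1
    return completed


# Slow trials with a small patience: the search stops before any trial completes.
stopped_early = simulate_search(patience=5, polls_per_trial=20, total_trials=10)
# Same slow trials with a large patience: all trials complete.
with_large_patience = simulate_search(patience=1000, polls_per_trial=20, total_trials=10)
# One suggestion per completed trial, i.e. no concurrent requests.
sequential = simulate_search(patience=5, polls_per_trial=1, total_trials=10)
print(stopped_early, with_large_patience, sequential)
```

Under these assumptions, either raising the patience or requesting one suggestion at a time lets all 10 trials complete, which is consistent with the behavior reported here.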
Solution
Apart from more sensible fixes, the issue can be avoided by increasing the patience argument of the optimizer.
Related issues
#13234, #28063.
System info
Hardware
Operating System: Ubuntu 24.04 LTS
Kernel: Linux 6.8.0-39-generic
Architecture: x86-64
Software
Python 3.12.3
bayesian-optimization==1.5.1
ray==2.37.0
Example script
Severity
Medium: It is a significant difficulty but I can work around it.