Bug: When Optuna is used as the search algorithm in Ray Tune, performance drops significantly, CPU utilization decreases, and the number of concurrently running trials is limited to 1.
Expected behavior: The Optuna integration should perform comparably to other search algorithms, use CPU resources efficiently, and run multiple trials concurrently as specified.
Detailed OS Information:
OS: Linux
OS Version: #1 SMP PREEMPT_DYNAMIC Sat Jun 29 07:01:04 UTC 2024
OS Release: 6.6.36.3-microsoft-standard-WSL2
Machine: x86_64
Processor: x86_64
Reproduction script
Running 100 trials takes about 4.6 times longer with Optuna:
With Optuna: 41 seconds
Without Optuna: 9 seconds
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

import ray
from ray import train, tune
from ray.tune.search.optuna import OptunaSearch


def generate_data(n_samples=1000):
    # Synthetic regression data: 5 uniform features, linear target plus Gaussian noise.
    X = np.random.rand(n_samples, 5)
    y = (
        2 * X[:, 0] + 3 * X[:, 1] - X[:, 2] + 0.5 * X[:, 3] - 1.5 * X[:, 4]
        + np.random.normal(0, 0.1, n_samples)
    )
    return X, y


def train_random_forest(config):
    # Trainable: fit one random forest per trial and report its test-set MSE.
    X, y = generate_data()
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
    rf = RandomForestRegressor(
        n_estimators=config["n_estimators"],
    )
    rf.fit(X_train, y_train)
    y_pred = rf.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    train.report({"mean_squared_error": mse})


def main():
    ray.init()
    # Search space: a single integer hyperparameter.
    config = {
        "n_estimators": tune.randint(10, 200),
    }
    analysis = tune.run(
        train_random_forest,
        config=config,
        num_samples=100,
        search_alg=OptunaSearch(),
        metric="mean_squared_error",
        mode="min",
        reuse_actors=True,
    )
    print("Best config:", analysis.best_config)
    print("Best MSE:", analysis.best_result["mean_squared_error"])


if __name__ == "__main__":
    main()
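The script above only shows the Optuna run; the baseline used for the 9-second figure is not included. A minimal sketch of how both timings could be taken is below, assuming the baseline is the same `tune.run` call with `search_alg` left at its default. The helper names `timed`, `run_tune`, and `slowdown` are hypothetical and not part of Ray or of the original report.

```python
import time


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def run_tune(trainable, config, use_optuna, num_samples=100):
    """Run the same experiment with or without OptunaSearch.

    Assumption: the "without Optuna" run simply omits search_alg,
    so Tune falls back to its default search behavior.
    """
    # Imported lazily so the timing helpers work without Ray installed.
    from ray import tune

    search_alg = None
    if use_optuna:
        from ray.tune.search.optuna import OptunaSearch

        search_alg = OptunaSearch()

    return tune.run(
        trainable,
        config=config,
        num_samples=num_samples,
        search_alg=search_alg,
        metric="mean_squared_error",
        mode="min",
        reuse_actors=True,
    )


def slowdown(with_optuna_s, baseline_s):
    """Ratio of the two wall-clock times."""
    return with_optuna_s / baseline_s
```

With the trainable and config from the script, `timed(run_tune, train_random_forest, config, use_optuna=True)` and the same call with `use_optuna=False` would give the two durations to compare. For the reported figures, `slowdown(41, 9)` is about 4.6.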
Versions / Dependencies
Ray version: 2.32.0
Python version: 3.11.8
Operating System: Linux-6.6.36.3-microsoft-standard-WSL2-x86_64-with-glibc2.35
Optuna version: 3.6.1
Issue Severity
High: It blocks me from completing my task.