j-adamczyk opened this issue 1 year ago (status: Open)
> In the current version of Optuna, there is no way to easily perform multiprocessing inside a single Python script.

Since the `concurrent.futures` module provides a high-level API, I think it would basically work if you changed `ThreadPoolExecutor` to `ProcessPoolExecutor`, like the following.
```python
import optuna
# from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import ProcessPoolExecutor as ThreadPoolExecutor


def objective(trial):
    x = trial.suggest_float("x", -100, 100)
    y = trial.suggest_float("y", -100, 100)
    return x**2 + y


def main():
    # Please use RDBStorage, JournalStorage or DaskStorage.
    study = optuna.create_study(storage="sqlite:///db.sqlite3")
    with ThreadPoolExecutor(max_workers=5) as pool:
        for i in range(5):
            pool.submit(study.optimize, objective, n_trials=10)
    print(f"Best params: {study.best_params}")


if __name__ == '__main__':
    main()
```
Using `concurrent.futures` makes it unnecessary for users to install joblib as an additional dependency and simplifies Optuna's source code. What do you think?
@c-bata that's a nice solution. I think even `from concurrent.futures import ProcessPoolExecutor as ThreadPoolExecutor` would suffice, as long as I import this before Optuna. However, this has 2 downsides:

- `n_jobs` has a different meaning for Scikit-learn and for its integration with Optuna.

But this would be an easy change to Optuna's behavior. Simply using either `ProcessPoolExecutor` or `ThreadPoolExecutor` in `_optimize()` would suffice for many use cases. However, arbitrary executors should also be supported, since they may offer major advantages. Most notably, the loky backend and executor is a more robust solution than plain multiprocessing, and faster, e.g. when using Numpy arrays. Using Joblib would make all three of those options use the same API, but it is not strictly necessary.

However, Scikit-learn already depends on Joblib, so a large chunk of Optuna users already depend on it anyway. Also, Optuna used to depend on Joblib, since older issues reference it. It is a relatively self-contained and lightweight dependency.
I see 3 options:

- Option 1: change `_optimize()` and other functions to support switching between the two. Does not add any dependencies and is simple, but is less robust and slower (when using Numpy at least) than option 3.
- Option 3: use Joblib in `_optimize()`, with threads as the default backend but with processes and Loky as options. This is the most robust solution and still easy to implement, but adds a dependency (a small and common one).

Hi, any updates regarding this feature request?
@okaikov unfortunately not, as far as I know. I see this as a major problem with Optuna, and I am currently researching other frameworks. Potentially using JoblibStudy from this PR (which sadly also got closed, meaning the problem is still there) may be good for your use case, but it requires copy-pasting that code.
> makes it unnecessary for users to install joblib as an additional dependency

This is not a serious problem. joblib is already used by many important libraries; it gets pulled in as a dependency as soon as you install something as widespread as scikit-learn.
Not having true multiprocessing in Optuna is a significant limitation at this point.
One problem that the joblib workaround does not solve is that, when you have multiple separate processes, each running `study.optimize()`, there is no shared in-memory storage. You have to use a shared external storage, which can be slow. I would very much like to run an efficient multiprocessing search with Optuna using the in-memory storage, but right now I can't.
> One problem that the joblib workaround does not solve is that, when you have multiple separate processes, each running `study.optimize()`, there is no shared in-memory storage.

This leads to the problem that, even using the alternative provided by @j-adamczyk, you are running N processes with the same set of hyperparameters... Does anyone have a workaround for the shared memory, in order to avoid having the same set of hyperparameters N times? If I'm not wrong, this also affects your configured pruning strategy.
True multiprocessing would indeed be very helpful. I would really love to run multiple trials simultaneously on a server with 4+ GPUs and get my results faster. Currently I just parallelize the model across however many GPUs are available, but multiple trials would be helpful.
### Motivation
In the current version of Optuna, there is no way to easily perform multiprocessing inside a single Python script. Running multiple terminals is impossible in automated cloud environments, and even where it is possible, it's plainly bad design. Scikit-learn and related APIs support `n_jobs` exactly for this purpose: if I set `n_jobs=-1`, I utilize all my CPU cores. Of course, using threads first is a reasonable design for Optuna, but there should be an option to change this behavior to use processes instead.

This would be useful for CPU-bound jobs where a single job cannot be parallelized easily. The most common use case is SVM in Scikit-learn, which is single-threaded but requires extensive hyperparameter tuning, which can be done in parallel. Another use case is training multiple neural networks on the same GPU from multiple CPU processes, e.g. Graph Neural Networks (GNNs).
An additional advantage would be that `OptunaSearchCV` would have the same meaning of `n_jobs` as Scikit-learn, which it integrates with.

### Description
The problem lies in the `_optimize` function, here:

Since the thread-based executor is hardcoded there, there is no way to specify anything else. Even setting the Joblib backend, which used to work, cannot work here.
However, if we could specify the executor, the user could use any backend supported by Joblib: regular Python multithreading or multiprocessing, Loky (efficient multiprocessing, the default in Scikit-learn), or anything else. I suggest using Joblib, as this is the easiest and most flexible option, and arguably the most popular.
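As a small illustration of the Joblib API being referred to (this is generic Joblib usage, not Optuna code), the same `Parallel` call can run on threads, plain processes, or Loky just by switching the backend name:

```python
# The same Joblib call runs on three different execution backends.
from joblib import Parallel, delayed


def square(x):
    return x * x


for backend in ("threading", "loky", "multiprocessing"):
    results = Parallel(n_jobs=2, backend=backend)(
        delayed(square)(i) for i in range(4)
    )
    assert results == [0, 1, 4, 9]
```

This unified interface is what makes the Joblib option attractive: the executor choice becomes a single string argument.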
There would be 2 changes required:

1. Add a `parallel_backend` option to `_optimize()` and the functions that call it, specifying the Joblib backend to use.
2. Use `joblib.Parallel()` instead of `ThreadPoolExecutor` in `_optimize()` for the multiple-jobs case.

Note that this does not require any changes to the RDB backend, as this is exactly equivalent to running Optuna in separate terminals.
### Alternatives (optional)
Currently, the only alternative is to manually launch multiple Optuna trials via Joblib (taken from here):
However, this requires a manual wrapper around core, important functionality. I have used this approach in multiple projects, but copy-pasting it so many times makes me feel it should just be built in.
### Additional context (optional)
No response