araffin / rl-baselines-zoo

A collection of 100+ pre-trained RL agents using Stable Baselines, training and hyperparameter optimization included.
https://stable-baselines.readthedocs.io/
MIT License

Clarification Optuna optimization n-jobs #42

Closed giulioturrisi closed 4 years ago

giulioturrisi commented 4 years ago

Hi, I would like to ask for a clarification on how Optuna is actually used here. Suppose I have a DummyVecEnv, and in the configuration file I have n_envs = 4. I know that the environments will run sequentially on a single CPU core. But what happens if I launch the Optuna optimization with a number of jobs = 2? Will Optuna automatically use two cores, putting 4 envs on one core and 4 on the other?

Thank you!

araffin commented 4 years ago

Hello,

This looks more like a question for Optuna, but I would say yes, it should create two processes and so use two cores, with 4 envs on each of those.

Looking at Optuna's code, it uses a multiprocessing thread pool (apparently an undocumented feature of Python).

giulioturrisi commented 4 years ago

Thank you for the answer. On my machine it seems to make the computation take longer (so it seems Optuna is not parallelizing!), but I will investigate a little bit more.

sile commented 4 years ago

Hi,

Thank you for using Optuna in this great library!

I would like to explain Optuna's behavior when n_jobs > 1 is passed to the Study.optimize method. In that case, Optuna spawns multiple threads to execute the optimization. However, Python's GIL (Global Interpreter Lock) prevents those threads from running Python code in parallel. Thus, typically, you can only parallelize I/O-heavy workloads by specifying the n_jobs option.
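
The GIL effect described above can be seen without Optuna at all. The following is a minimal standard-library sketch, not Optuna's own code: `cpu_bound` and `timed_map` are hypothetical helper names, and the workload size is arbitrary. It runs the same pure-Python loop through a thread pool and a process pool; on a standard CPython build with at least two free cores, the thread version would typically take about as long as running the jobs one after another, while the process version can use both cores.

```python
import time
from multiprocessing import Pool
from multiprocessing.pool import ThreadPool


def cpu_bound(n):
    """Pure-Python loop: a stand-in for a CPU-heavy objective (hypothetical)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def timed_map(pool_cls, workers, jobs):
    """Map cpu_bound over jobs with the given pool class; return (results, seconds)."""
    start = time.perf_counter()
    with pool_cls(workers) as pool:
        results = pool.map(cpu_bound, jobs)
    return results, time.perf_counter() - start


if __name__ == "__main__":
    jobs = [2_000_000] * 4  # arbitrary workload size
    _, thread_time = timed_map(ThreadPool, 2, jobs)  # threads share one GIL
    _, process_time = timed_map(Pool, 2, jobs)       # one interpreter per process
    print(f"2 threads:   {thread_time:.2f}s")
    print(f"2 processes: {process_time:.2f}s")
```

Exact timings depend on the machine, so no expected output is given here.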

To utilize your CPU cores, you need to either parallelize your objective function or use the Distributed Optimization feature.

araffin commented 4 years ago

@sile thanks for the clarification (in fact, I was afraid of the GIL when I read ThreadPool), and thank you for this awesome library that is Optuna =).