DLR-RM / rl-baselines3-zoo

A training framework for Stable Baselines3 reinforcement learning agents, with hyperparameter optimization and pre-trained agents included.
https://rl-baselines3-zoo.readthedocs.io
MIT License

[Feature request] Restore n-jobs > 1 functionality #200

Closed: nhewitt99 closed this issue 2 years ago

nhewitt99 commented 2 years ago

Not sure if this is the right tag. I'm following up on #79, where it was pointed out that Optuna deprecated the n-jobs argument for optimization. As best I can tell, although the argument still exists in train.py and exp_manager.py, it is ignored during optimization. When I test with the following command, only one process is spun up on my GPU:

```
python train.py --algo ppo --env MountainCar-v0 -n 50000 -optimize --n-trials 1000 --n-jobs 2 --sampler tpe --pruner median --verbose 1 --study-name mountain_car_1 --storage sqlite:///optuna.db
```

Is there interest in restoring this functionality? I've looked over the Optuna issue and the relevant docs, and mocked up a solution with multiprocessing in my own fork. However, my experience with multiprocessing is amateurish (at best!), so I'm curious whether anyone has input before I rush into polishing it up and submitting a PR.
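
For illustration, here is a minimal sketch of the general idea (not the actual code from my fork): each worker process attaches to the same study through the shared SQLite storage and runs its share of trials. The objective below is just a placeholder for the zoo's real objective; the study and storage names are taken from the command above.

```python
import multiprocessing

import optuna


def objective(trial: optuna.Trial) -> float:
    # Placeholder objective; the zoo would train and evaluate an agent here.
    x = trial.suggest_float("x", -10, 10)
    return (x - 2) ** 2


def run_worker(n_trials: int) -> None:
    # Each process re-attaches to the shared study instead of creating a new one.
    study = optuna.create_study(
        study_name="mountain_car_1",
        storage="sqlite:///optuna.db",
        load_if_exists=True,
    )
    study.optimize(objective, n_trials=n_trials)


if __name__ == "__main__":
    n_jobs, n_trials = 2, 1000
    workers = [
        multiprocessing.Process(target=run_worker, args=(n_trials // n_jobs,))
        for _ in range(n_jobs)
    ]
    for p in workers:
        p.start()
    for p in workers:
        p.join()
```

(SQLite is fine for a quick test, but the Optuna docs recommend a proper database backend when running many workers in parallel.)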

araffin commented 2 years ago

Hello, distributed optimization is supported by the zoo: https://github.com/DLR-RM/rl-baselines3-zoo#hyperparameter-tuning

You need to launch the same command in several terminals, all sharing the same --study-name and --storage:

```
python train.py --algo ppo --env MountainCar-v0 -n 50000 -optimize --n-trials 1000 --sampler tpe --pruner median --verbose 1 --study-name mountain_car_1 --storage sqlite:///optuna.db
```

The n-jobs argument only acts with threads, I think.
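
If opening terminals by hand gets tedious, a small launcher script can do the same thing. This is only a sketch of that workaround: the flags are copied from the command above, and n_workers plays the role of --n-jobs.

```python
import subprocess

# Same command as above; every process points at the same study name and storage,
# so Optuna distributes the trials between them.
CMD = [
    "python", "train.py",
    "--algo", "ppo", "--env", "MountainCar-v0",
    "-n", "50000", "-optimize", "--n-trials", "1000",
    "--sampler", "tpe", "--pruner", "median",
    "--verbose", "1",
    "--study-name", "mountain_car_1",
    "--storage", "sqlite:///optuna.db",
]

if __name__ == "__main__":
    n_workers = 2  # one full process per worker, instead of --n-jobs
    procs = [subprocess.Popen(CMD) for _ in range(n_workers)]
    for p in procs:
        p.wait()
```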

You should probably read https://github.com/DLR-RM/rl-baselines3-zoo/issues/153 too

nhewitt99 commented 2 years ago

Sounds good! I've used the multiple terminals workaround before, but I was confused about whether n-jobs was supposed to provide this functionality without spinning up multiple scripts. Thanks for the reply; I'll go ahead and close this :)