LucasAlegre / morl-baselines

Implementations of Multi-Objective Reinforcement Learning algorithms.
https://lucasalegre.github.io/morl-baselines
MIT License

Parallel Hyperparameter Search #84

Closed lowlypalace closed 1 month ago

lowlypalace commented 8 months ago

PR Description

This PR makes the hyperparameter search more scalable by running the training loop for each agent in parallel. This is especially useful when running the search on a cluster (e.g. Slurm).
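The per-seed parallelism described above can be sketched roughly as follows. `train_one`, `parallel_search`, and the returned dict are hypothetical stand-ins rather than the PR's actual API, and the real implementation presumably uses process-based workers; threads just keep the sketch self-contained:

```python
from concurrent.futures import ThreadPoolExecutor

def train_one(seed, device):
    # stand-in for a full MORL training loop (hypothetical)
    return {"seed": seed, "device": device}

def parallel_search(seeds, devices, num_workers):
    # launch one training job per seed, at most num_workers running at once;
    # devices are handed out round-robin so e.g. 4 seeds across
    # cuda:0..cuda:3 each get their own GPU
    with ThreadPoolExecutor(max_workers=num_workers) as ex:
        futures = [
            ex.submit(train_one, seed, devices[i % len(devices)])
            for i, seed in enumerate(seeds)
        ]
        return [f.result() for f in futures]
```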

TODO:

Example Configs on a Slurm Cluster

Using 4 GPUs + 4 workers

#SBATCH --nodes=1 # node count
#SBATCH --ntasks=1 # total number of tasks across all nodes

#SBATCH --cpus-per-task 4 # number of CPU cores per task
#SBATCH -G 4 # number of GPUs

python experiments/hyperparameter_search/launch_sweep.py \
--algo envelope \
--env-id minecart-v0 \
--sweep-count 100 \
--seed 10 \
--num-seeds 4 \
--num-workers 4 \
--devices cuda:0 cuda:1 cuda:2 cuda:3 

Using 4 CPUs + 4 workers

#SBATCH --nodes=1 # node count
#SBATCH --ntasks=1 # total number of tasks across all nodes

#SBATCH --cpus-per-task 4 # number of CPU cores per task

python experiments/hyperparameter_search/launch_sweep.py \
--algo envelope \
--env-id minecart-v0 \
--sweep-count 100 \
--seed 10 \
--num-seeds 4 \
--num-workers 4 

Each worker will use the `auto` device setting, and each algorithm instance will then default to `cpu` since CUDA is not available.
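The `auto` fallback can be illustrated with a minimal sketch; `resolve_device` is a hypothetical helper, and CUDA availability is passed in explicitly so the example does not depend on a GPU install:

```python
def resolve_device(requested="auto", cuda_available=False):
    # "auto" resolves to cuda when a GPU is visible, otherwise to cpu;
    # explicit devices like "cuda:2" are passed through unchanged
    if requested == "auto":
        return "cuda" if cuda_available else "cpu"
    return requested
```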

Example Runs on a Slurm Cluster

Example Runs:

| Workers | CPUs | GPUs | CPU Usage | GPU Usage | Sweeps |
|---|---|---|---|---|---|
| 4 | 4 | 0 | 94.88% | N/A | 18 |
| 4 | 1 | 0 | 95.55% | N/A | 15 |
| 1 | 1 | 0 | 25.03% | N/A | 15 |
| 4 | 4 | 1 | 18.78% | 99% | 5 |
| 4 | 4 | 4 | 31.07% | 9% / 11% / 12% / 10% | 5 |
| 4 | 1 | 4 | 98.85% | 4% / 5% / 5% / 5% | 13 |
lowlypalace commented 8 months ago

Any idea why black is complaining? I've run the linter on my end.