This feature introduces a new script to perform a sweep of multi-objective reinforcement learning (MORL) algorithms and environments. The script runs a series of experiments, collects performance metrics, and logs the results to Weights & Biases (W&B).
Each configuration is trained with multiple seeds in parallel, using ProcessPoolExecutor to run one agent per seed concurrently. Training over several seeds accounts for run-to-run variability in the learning process and gives a more reliable evaluation of each algorithm's performance. The hypervolume obtained for each seed is averaged, and the average is logged to Weights & Biases.
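The per-seed parallelism described above can be sketched as follows. This is a minimal illustration, not the PR's code: train_one_seed is a hypothetical stand-in that returns a made-up hypervolume instead of training an agent, and average_hypervolume is an assumed helper name.

```python
from concurrent.futures import ProcessPoolExecutor
import random
from statistics import mean


def train_one_seed(seed: int) -> float:
    """Hypothetical stand-in for training one agent; returns a fake hypervolume."""
    rng = random.Random(seed)  # seeded, so each worker is reproducible
    return 100.0 + rng.uniform(-5.0, 5.0)


def average_hypervolume(seeds: list[int]) -> float:
    """Run one worker process per seed and average the returned values."""
    with ProcessPoolExecutor(max_workers=len(seeds)) as executor:
        # map() returns results in the same order as `seeds`
        hypervolumes = list(executor.map(train_one_seed, seeds))
    return mean(hypervolumes)


if __name__ == "__main__":
    # The guard matters: worker processes may re-import this module
    print(average_hypervolume([0, 1, 2]))
```

Using separate processes rather than threads keeps each seed's training isolated, so one run cannot share random state or memory with another.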
Components Description
The main components of the feature are:
Argument parsing: Parse command-line arguments for the algorithm, environment ID, reference point, W&B entity, project name, number of seeds, and training hyperparameters.
Worker classes: Define classes to handle worker setup and results, including WorkerInitData and WorkerDoneData.
Train function: Implement a train function to instantiate the selected algorithm, train the agent, and return the hypervolume metric.
Main function: Initialize W&B, create a process pool of workers, submit tasks to the workers, collect results, compute the average hypervolume, and log the metrics to W&B.
Sweep setup and execution: Load the sweep configuration, set up the sweep with W&B, and run the sweep agent using the main function.
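To make the role of the reference point concrete, here is a small sketch of the hypervolume for two objectives under maximization. It is only an illustration of the metric; the description does not say how the script computes hypervolume, and hypervolume_2d is a hypothetical name.

```python
def hypervolume_2d(points, ref_point):
    """Area dominated by `points` and bounded below by `ref_point` (maximization)."""
    ref_x, ref_y = ref_point
    # Keep only points that improve on the reference point in both objectives
    candidates = [(x, y) for x, y in points if x > ref_x and y > ref_y]
    # Sweep from the largest first objective to the smallest
    candidates.sort(key=lambda p: p[0], reverse=True)
    area = 0.0
    best_y = ref_y
    for x, y in candidates:
        if y > best_y:  # points that do not raise best_y add no new area
            area += (x - ref_x) * (y - best_y)
            best_y = y
    return area


front = [(3, 1), (2, 2), (1, 3)]
print(hypervolume_2d(front, (0, 0)))  # 6.0
```

A larger value means the front found by the agent covers more of the objective space above the reference point, which is why the reference point has to be supplied per environment.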
The script allows users to easily perform a sweep of MORL algorithms and environments, exploring different hyperparameters and logging the results to W&B for further analysis.
The sweep configs, which define the hyperparameter ranges, should be placed in the configs directory in a file named after the corresponding algorithm, such as envelope.yaml.
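As a rough sketch of what such a config might contain and how it could be handed to W&B, the example below builds the configuration as a Python dict instead of loading it from YAML. The hyperparameter names, the ranges, the metric name avg_hypervolume, and the helper names are all assumptions for illustration; only wandb.sweep and wandb.agent are real W&B calls.

```python
def build_sweep_config(algo_name: str) -> dict:
    """Return a W&B-style sweep configuration (illustrative values only)."""
    return {
        "name": f"{algo_name}-sweep",
        "method": "bayes",
        # Hypothetical metric name; it must match what the main function logs
        "metric": {"name": "avg_hypervolume", "goal": "maximize"},
        "parameters": {
            # Hypothetical hyperparameters and ranges
            "learning_rate": {"distribution": "uniform", "min": 1e-4, "max": 1e-2},
            "batch_size": {"values": [32, 64, 128]},
        },
    }


def run_sweep(config: dict, entity: str, project: str, main, count: int) -> None:
    """Register the sweep with W&B and run `main` once per sampled configuration."""
    import wandb  # imported lazily so the config can be inspected without W&B

    sweep_id = wandb.sweep(sweep=config, entity=entity, project=project)
    wandb.agent(sweep_id, function=main, count=count)


if __name__ == "__main__":
    print(build_sweep_config("envelope")["parameters"].keys())
```

Keeping one config per algorithm lets each algorithm expose its own hyperparameters while the sweep launcher stays generic.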
Other Changes
Additionally, the PR reorganizes the file structure and moves the functions shared by launch_experiment.py and launch_sweep.py into common/experiments.py.
Recreating from #57
Solves #13
Paper: https://arxiv.org/abs/2310.16487
Usage
An example usage: