Generalization of Heterogeneous Multi-Robot Policies via Awareness and Communication of Capabilities | CoRL 2023
This codebase was used for the experiments in Generalization of Heterogeneous Multi-Robot Policies via Awareness and Communication of Capabilities.

Pierce Howell, Max Rudolph, Reza Torbati, Kevin Fu, & Harish Ravichandar. Generalization of Heterogeneous Multi-Robot Policies via Awareness and Communication of Capabilities, 7th Annual Conference on Robot Learning (CoRL), 2023

Installation Instructions

Download the Repository

git clone ....
cd ...
git submodule init    # Register the MARBLER submodule
git submodule update  # Fetch the submodule contents

Python Environment


Create an Anaconda environment with Python 3.8:

conda create -n cap-comm python=3.8 pip
conda activate cap-comm

Install PyTorch for your system's configuration. See the PyTorch installation instructions.

Install requirements

pip install -r requirements.txt

Multi-Particle Environment

Next, install the multi-agent particle environment. Change into the mpe directory and run:

pip install -e .


To install the MARBLER dependencies, follow the installation instructions in the MARBLER repo README.
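Once the steps above are complete, a quick import check can confirm that the main packages are visible to the active environment. The following is a minimal sketch and not part of the repository: the helper `missing_packages` is hypothetical, and the import names `mpe` and `robotarium_gym` are assumptions inferred from the directory and environment names used in this README.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # "torch" is PyTorch's import name; "mpe" and "robotarium_gym" are
    # assumed import names and may differ in your checkout.
    missing = missing_packages(["torch", "mpe", "robotarium_gym"])
    if missing:
        print("Not importable:", ", ".join(missing))
    else:
        print("All packages found.")
```

Run it with the cap-comm environment activated; any name it reports points to the installation step that should be repeated.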

Download Pre-trained Models

Download and extract the models used for the results in the paper:

cd [repo_path]/cap-comm/
mkdir pretrained_models && cd pretrained_models

wget -O

wget -O ""


The trained models need to be moved to the eval folder under the correct experiment. From within pretrained_models, run the following:

# Keep the * outside the quotes so that the shell expands it
cp -r "mpe:MaterialTransport-v0/experiments/"* "../eval/eval_experiments_and_configs/mpe:MaterialTransport-v0/experiments/"
cp -r "robotarium_gym:HeterogeneousSensorNetwork-v0/"* "../eval/eval_experiments_and_configs/robotarium_gym:HeterogeneousSensorNetwork-v0/experiments/"


The evaluations can be reproduced using the downloaded pretrained models. The evaluation process consists of two stages: (i) data collection and (ii) reporting. During data collection, the trained models are deployed in the target environment and evaluation metrics are recorded. The data collection script loads the models (one per seed) from the eval/eval_experiments_and_configs/[ENVIRONMENT_NAME]/experiments directory and the evaluation configurations from eval/eval_experiments_and_configs/[ENVIRONMENT_NAME]/eval_configs. During reporting, the collected data is plotted and the plots are saved as figures in eval/eval_experiments_and_configs/[ENVIRONMENT_NAME]/eval_figures (for example, eval/eval_experiments_and_configs/mpe:MaterialTransport-v0/eval_figures).
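The directory layout described above can be sketched in a few lines of Python. This is an illustration only: `eval_paths` is a hypothetical helper, not a function from the repository, and only the directory names are taken from this README.

```python
from pathlib import Path

# Root of the evaluation assets, relative to the repository root.
EVAL_ROOT = Path("eval") / "eval_experiments_and_configs"


def eval_paths(environment_name, root=EVAL_ROOT):
    """Return the directories used during evaluation for one environment."""
    env_dir = root / environment_name
    return {
        "experiments": env_dir / "experiments",    # trained models, one per seed
        "eval_configs": env_dir / "eval_configs",  # evaluation configurations
        "eval_figures": env_dir / "eval_figures",  # figures written by reporting
    }


if __name__ == "__main__":
    # Report which of the expected directories exist in the current checkout.
    for name, path in eval_paths("mpe:MaterialTransport-v0").items():
        status = "found" if path.is_dir() else "missing"
        print(f"{name:>12}: {path} ({status})")
```

Running the script from the repository root shows whether the pretrained models were copied into the expected experiments directory before data collection starts.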

Heterogeneous Material Transport Environment (HMT)

Data Collection

cd [REPO_PATH]/cap-comm/eval

All the models will be run for each config. Please see the scripts eval/ and eval/ for more details on how the evaluations are performed. Note that eval/ runs eval/ with multiprocessing for faster evaluation.
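The "every model for each config" pattern with parallel workers can be sketched as follows. This is not the repository's evaluation code: `plan_eval_jobs`, `run_eval_jobs`, and the `evaluate` callback are hypothetical names, and the sketch only illustrates the general idea using the Python standard library.

```python
from concurrent.futures import ProcessPoolExecutor
from itertools import product


def plan_eval_jobs(models, configs):
    """Pair every model with every evaluation config (one job per pair)."""
    return list(product(models, configs))


def run_eval_jobs(jobs, evaluate, workers=4):
    """Run evaluate(model, config) for each job across worker processes.

    `evaluate` is a placeholder for whatever performs a single evaluation;
    it must be a picklable, module-level function.
    """
    if not jobs:
        return []
    models, configs = zip(*jobs)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves job order, so results line up with `jobs`.
        return list(pool.map(evaluate, models, configs))
```

With 5 models and 3 configs, `plan_eval_jobs` yields 15 jobs, and `run_eval_jobs` spreads them over the requested number of worker processes.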


cd [REPO_PATH]/cap-comm/eval

# Start the Jupyter notebook server
jupyter notebook

Open the Jupyter notebook interface in a local browser, then open the file mpe_material_transport_evaluation_reporting.ipynb. Run all the cells to produce the evaluation figures. The figures will be saved in the directory eval/eval_experiments_and_configs/mpe:MaterialTransport-v0/eval_figures.

Heterogeneous Sensor Network Environment (HSN)

Data Collection

cd [REPO_PATH]/cap-comm/eval


cd [REPO_PATH]/cap-comm/eval

# Start the Jupyter notebook server
jupyter notebook

In the Jupyter notebook file browser, open the file marbler_hsn_evaluation_reporting.ipynb. Run all the cells to produce the evaluation figures. The figures will be saved in the directory eval/eval_experiments_and_configs/robotarium_gym:HeterogeneousSensorNetwork-v0/eval_figures.

Training New Models

The models are trained using the EPyMARL training framework. This section lists the commands that were run to generate the policies used in the paper.

Heterogeneous Material Transport Environment (HMT)

Before starting training, verify that the environment configuration is correct. At this time, command-line arguments DO NOT override the environment-specific configurations. These must be set in mpe/mpe/scenarios/configs/material_transport/base_config.yaml. The main configuration parameters to change (depending on the experiment) include capability_aware and agent_id; the required values are noted with each experiment below.

The commands for the experiments are provided in the /scripts/mpe:MaterialTransport-v0 directory:

# Run the GNN with capability awareness and communication of capabilities (i.e. CA+CC (GNN))
# set capability_aware = True and agent_id = False

# Run the GNN with no communication of capabilities, but the agent's action network is conditioned on capabilities (i.e. CA (GNN))
# set capability_aware = True and agent_id = False

# Run the GNN with agent ID (i.e. ID (GNN))
# set capability_aware = False and agent_id = True

# Run the MLP with capability-aware agents (i.e. CA (MLP))
# set capability_aware = True and agent_id = False

# Run the MLP with agent IDs (i.e. ID (MLP))
# set capability_aware = False and agent_id = True
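Because command-line arguments do not override these settings, it can help to check the loaded configuration against the intended experiment before launching a run. The sketch below transcribes the flag values from the comments above into a lookup table. `VARIANT_FLAGS` and `config_mismatches` are hypothetical names, and the sketch assumes that `capability_aware` and `agent_id` are top-level keys of the configuration once it has been loaded into a dictionary (for example with PyYAML's `yaml.safe_load`); the actual file layout may differ.

```python
# Flag settings per experiment, transcribed from the comments above.
VARIANT_FLAGS = {
    "CA+CC (GNN)": {"capability_aware": True, "agent_id": False},
    "CA (GNN)": {"capability_aware": True, "agent_id": False},
    "ID (GNN)": {"capability_aware": False, "agent_id": True},
    "CA (MLP)": {"capability_aware": True, "agent_id": False},
    "ID (MLP)": {"capability_aware": False, "agent_id": True},
}


def config_mismatches(variant, config):
    """Return {key: (expected, actual)} for every flag that differs.

    `config` is the environment configuration as a plain dict; the two
    flags are assumed to be top-level keys.
    """
    expected = VARIANT_FLAGS[variant]
    return {
        key: (value, config.get(key))
        for key, value in expected.items()
        if config.get(key) != value
    }
```

An empty result means the configuration matches the chosen experiment; otherwise each entry shows the expected value next to the one found.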

Saving the Models

Each model is trained with 3 seeds (extra seeds can be added). Model checkpoints and training metrics are saved in the /results directory.

Heterogeneous Sensor Network Environment (HSN)

Citing MARBLER and EPyMARL

The experiments relied on the MARBLER-CA codebase, which is a fork of the original MARBLER repository. This fork introduces the heterogeneous sensor network environment with capability-aware robots. The MARBLER framework is presented in MARBLER: An Open Platform for Standardized Evaluation of Multi-Robot Reinforcement Learning Algorithms.

Reza Torbati, Shubham Lohiya, Shivika Singh, Meher Shashwat Nigam, Harish Ravichandar. MARBLER: An Open Platform for Standardized Evaluation of Multi-Robot Reinforcement Learning Algorithms, 2023

The Extended PyMARL (EPyMARL) codebase was used in Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms in Cooperative Tasks.

Georgios Papoudakis, Filippos Christianos, Lukas Schäfer, & Stefano V. Albrecht. Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms in Cooperative Tasks, Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks (NeurIPS), 2021

All source code taken from the PyMARL repository was licensed (and remains so) under the Apache License v2.0 (included in the LICENSE file). Any new code is also licensed under the Apache License v2.0.