In this work, we conduct the first systematic study on the efficacy of 3D visual RL from point clouds, and compare it with well-established RL from 2D RGB/RGB-D representations. Through extensive experiments, we demonstrate that 3D point cloud representations are particularly beneficial on tasks where agent-object/object-object spatial relationship reasoning plays a crucial role, and that they achieve better sample complexity and performance than 2D image-based agents. Moreover, we carefully investigate the design choices for 3D point cloud RL agents from perspectives such as network inductive bias, representational robustness, data augmentation, and data post-processing. We hope that our study provides insights, inspiration, and guidance for future work on 3D visual RL.
First, install MuJoCo by downloading https://github.com/deepmind/mujoco/releases/download/2.1.0/mujoco210-linux-x86_64.tar.gz (or https://github.com/deepmind/mujoco/releases/download/2.1.0/mujoco210-macos-x86_64.tar.gz for macOS). After extracting the tar file, put the mujoco210 directory inside the ~/.mujoco directory (create it if it does not exist). Then, ensure that the following line is in your ~/.bashrc file (or ~/.zshrc if you use zsh):
export LD_LIBRARY_PATH=/home/{USERNAME}/.mujoco/mujoco210/bin:$LD_LIBRARY_PATH
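As a quick sanity check of the two steps above, the following sketch reports whether the mujoco210/bin directory exists under ~/.mujoco and whether it appears on LD_LIBRARY_PATH. The helper name check_mujoco_setup is hypothetical and not part of this repository.

```python
from __future__ import annotations

import os
from pathlib import Path


def check_mujoco_setup(home: Path, ld_library_path: str) -> list[str]:
    """Return a list of problems found; an empty list means the setup looks fine."""
    problems = []
    bin_dir = home / ".mujoco" / "mujoco210" / "bin"
    if not bin_dir.is_dir():
        problems.append(f"missing directory: {bin_dir}")
    # LD_LIBRARY_PATH is a list of directories joined by os.pathsep (":" on Linux/macOS).
    entries = [Path(p) for p in ld_library_path.split(os.pathsep) if p]
    if bin_dir not in entries:
        problems.append(f"{bin_dir} is not on LD_LIBRARY_PATH")
    return problems
```

Calling check_mujoco_setup(Path.home(), os.environ.get("LD_LIBRARY_PATH", "")) in a fresh shell should return an empty list once the export line has taken effect.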
Then, install dependencies using the following procedure:
git clone https://github.com/lz1oceani/pointcloud_rl.git
cd pointcloud_rl
conda env create -f environment.yml
conda activate efficacy_pointcloud_rl
cd mani_skill/
pip install -e . # installs ManiSkill
pip install sapien # installs SAPIEN
cd .. # back to pointcloud_rl
pip install -e .
pip install torch==1.13.1+cu116 torchvision==0.14.1+cu116 --extra-index-url https://download.pytorch.org/whl/cu116
pip install -r requirements.txt # installs DM Control and pyrl learning repo
pip uninstall torchsparse
sudo apt-get install libsparsehash-dev # brew install google-sparsehash if you use Mac OS
# before installing torchsparse, make sure the nvcc version is the same as the cuda version used when installing pytorch
pip install git+https://github.com/lz1oceani/torchsparse.git
For the PyTorch installation, you can change the CUDA toolkit version (cu116 in the command above) to match the CUDA version of your machine.
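The install steps require that the nvcc version match the CUDA version PyTorch was built with before torchsparse is compiled. A minimal sketch of that comparison follows, assuming the usual "release X.Y" line printed by nvcc --version. The function names are hypothetical.

```python
import re
from typing import Optional


def nvcc_release(nvcc_output: str) -> Optional[str]:
    """Extract the 'major.minor' release from the text printed by `nvcc --version`."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def cuda_versions_match(torch_cuda: Optional[str], nvcc_output: str) -> bool:
    """Compare PyTorch's CUDA version string (e.g. "11.6") with nvcc's release."""
    release = nvcc_release(nvcc_output)
    return torch_cuda is not None and release is not None and torch_cuda == release
```

In the activated environment, torch.version.cuda supplies the first argument, and the captured standard output of nvcc --version supplies the second.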
We used random seeds 1000, 2000, and 3000 in our experiments. The seed can be changed with the --seed flag.
Usage:
python ./pyrl/apis/run_rl.py config_file_path \
--work-dir {work_directory_path} \
[--seed {seed}] \
[--gpu-ids {gpu_ids...} | --num-gpus {number_of_gpus}] \
[--cfg-options ["env_cfg.env_name={environment_name}"] ["env_cfg.obs_mode={observation_mode}"]]
Explanations:
./pyrl/apis/run_rl.py is the driver module that starts the training. The first argument to this module is the configuration file path.
Set config_file_path to one of the configuration files in the configs/mfrl directory.
Use --work-dir to set the save location of the experiment.
--seed takes a single integer seed. If omitted, a random seed is used.
If using --gpu-ids, pass any number of existing CUDA GPU IDs. If using --num-gpus, pass a single integer for the number of GPUs to use. For our DM Control experiments, we used --gpu-ids 0. For our ManiSkill experiments, we used --gpu-ids 0 1 (or --num-gpus 2).
The --cfg-options flag overrides parts of the configuration file. It is optional and is followed by any number of modifications to the configuration file. Each modification must follow the format "{arg}={option}" (the quotation marks here are necessary). The two most important modifications are "env_cfg.env_name" and "env_cfg.obs_mode".
Replace env_name with one of the environments from DM Control or ManiSkill. Note that config_file_path must point to a configuration file from the same environment domain. For example, "env_cfg.env_name=OpenCabinetDoor_train-v0".
Replace obs_mode with one of rgb, rgbd, or pointcloud. For example, "env_cfg.obs_mode=pointcloud". It is not necessary to set the observation mode explicitly if the configuration file's network type is pn or sparse_conv. The cnn configuration files can be run in both rgb and rgbd mode, and the default is rgb mode.
You can also manually modify the env_name component or the obs_mode component of env_cfg inside the configuration files to accomplish the same outcome. However, using --cfg-options adds flexibility.
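To illustrate what an "{arg}={option}" override means for a nested configuration, here is a minimal sketch that applies dotted keys to a plain dictionary. It is an illustration under assumptions, not the parsing code used by run_rl.py: the function name apply_overrides is hypothetical, and values are kept as strings.

```python
from typing import Any, Dict, List


def apply_overrides(config: Dict[str, Any], overrides: List[str]) -> Dict[str, Any]:
    """Apply "{arg}={option}" strings to a nested dict in place and return it."""
    for item in overrides:
        key, sep, value = item.partition("=")
        if not sep or not key:
            raise ValueError(f'expected "{{arg}}={{option}}", got {item!r}')
        # "env_cfg.obs_mode" -> parents ["env_cfg"], leaf "obs_mode"
        *parents, leaf = key.split(".")
        node = config
        for name in parents:
            node = node.setdefault(name, {})
        node[leaf] = value  # kept as a string in this sketch
    return config
```

For example, applying ["env_cfg.obs_mode=rgbd"] to {"env_cfg": {"obs_mode": "rgb"}} changes only the obs_mode entry and leaves the rest of the configuration untouched.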
As an example, to run DrQ with jitter augmentation on ManiSkill's MoveBucket_4000-v0 environment:
python ./pyrl/apis/run_rl.py ./configs/mfrl/drq/maniskill/pn_jitter.py \
--work-dir=/path/to/workdir/ --seed 1000 --num-gpus 2 --cfg-options "env_cfg.env_name=MoveBucket_4000-v0"
Another example of running SAC with rgbd input on the Walker Walk task of DM Control:
python ./pyrl/apis/run_rl.py ./configs/mfrl/sac/dm_control/cnn.py \
--work-dir=/path/to/workdir/ --seed 1000 --gpu-ids 0 \
--cfg-options "env_cfg.env_name=dmc_walker_walk-v0" "env_cfg.obs_mode=rgbd"
# if you encounter errors when running DM Control environments in headless mode, add "MUJOCO_GL=egl" before the command.
For the motivating example, to run DrQ with rgbd observation mode:
python ./pyrl/apis/run_rl.py ./configs/mfrl/drq/dm_control/cnn_shift_motivating.py \
--work-dir=/path/to/workdir/ --seed 1000 --gpu-ids 0 --cfg-options "env_cfg.obs_mode=rgbd"
# Since there is only one environment for our motivating example, we do not need to change the environment_name.
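The example commands above all share one shape, so repeating a run over the three seeds used in our experiments (1000, 2000, 3000) can be scripted. The sketch below only builds the command strings. The helper names and the per-seed work directory layout (seed_{seed} subfolders) are assumptions, not conventions of this repository.

```python
import shlex
from typing import List, Sequence

SEEDS = (1000, 2000, 3000)  # seeds listed in this README


def build_command(config_path: str, work_dir: str, seed: int,
                  gpu_ids: Sequence[int], cfg_options: Sequence[str] = ()) -> List[str]:
    """Assemble the argument list for one training run."""
    cmd = ["python", "./pyrl/apis/run_rl.py", config_path,
           "--work-dir", work_dir, "--seed", str(seed),
           "--gpu-ids", *map(str, gpu_ids)]
    if cfg_options:
        cmd += ["--cfg-options", *cfg_options]
    return cmd


def sweep_commands(config_path: str, work_root: str, gpu_ids: Sequence[int],
                   cfg_options: Sequence[str] = ()) -> List[str]:
    """Return one shell-quoted command line per seed (hypothetical directory layout)."""
    root = work_root.rstrip("/")
    return [
        shlex.join(build_command(config_path, f"{root}/seed_{seed}", seed, gpu_ids, cfg_options))
        for seed in SEEDS
    ]
```

Passing each argument list to subprocess.run, instead of joining it into a string, avoids shell quoting issues with the --cfg-options values.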
If you encounter errors when running DM Control environments in headless mode, set the MUJOCO_GL environment variable for the command:
MUJOCO_GL=egl python {command} # or MUJOCO_GL=osmesa
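When training is launched from a Python script rather than a shell, the same variable can be passed through the child process environment. This is a minimal sketch, and the helper name headless_env is hypothetical.

```python
import os
from typing import Dict, Mapping, Optional

HEADLESS_BACKENDS = ("egl", "osmesa")  # the two values mentioned above


def headless_env(backend: str = "egl",
                 base: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with MUJOCO_GL set to a headless backend."""
    if backend not in HEADLESS_BACKENDS:
        raise ValueError(f"unsupported backend: {backend!r}")
    env = dict(os.environ if base is None else base)
    env["MUJOCO_GL"] = backend
    return env
```

The returned dictionary can be given to subprocess.run(..., env=headless_env()) so that the variable is set only for the training process.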
DM Control environment names follow this format: dmc_{domain_name}_{task_name}-v0. We support all DM Control environments. Specifically,
ManiSkill environment names follow this format: {task_name}_[{obj_number} | train]-v0. The environments used in our experiments are:
Location of algorithm implementations:
./pyrl_code_release_v1/pyrl/methods/mfrl/sac.py
./pyrl_code_release_v1/pyrl/methods/mfrl/drq.py
Location of network implementations:
./pyrl_code_release_v1/pyrl/networks/backbones/cnn.py
./pyrl_code_release_v1/pyrl/networks/backbones/pointnet.py
./pyrl_code_release_v1/pyrl/networks/backbones/sp_resnet.py
Config files are located in the configs/mfrl directory, and are divided by algorithm (sac, drq) and environment domain (dm_control or maniskill). Config file names follow the format network_type[_augmentation].py. The network_type can be cnn for CNN, pn for PointNet, and sparse_conv for SparseConvNet. For DrQ configuration files, the [augmentation] can be shift, rot, jitter, colorjitter, or dropout. PointNet can use all augmentations, while CNN can only use the shift augmentation.
If a configuration file name contains the suffix motivating, it is for the motivating example we provided in our experiments and does not belong to the DM Control environments.
For example, the configuration file for running SAC on the walker_walk environment with rgb or rgbd observation mode is ./configs/mfrl/sac/dm_control/cnn.py. The configuration file for running DrQ on the MoveBucket_4000-v0 environment with point cloud observation mode and jitter augmentation is ./configs/mfrl/drq/maniskill/pn_jitter.py.
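The naming rules above can be sketched as a small path builder. The function config_path is hypothetical and only composes a file name from the stated pattern. It does not check that the file exists in the repository, and it makes no assumption about which augmentations sparse_conv supports.

```python
from typing import Optional

ALGORITHMS = ("sac", "drq")
DOMAINS = ("dm_control", "maniskill")
NETWORKS = ("cnn", "pn", "sparse_conv")
AUGMENTATIONS = ("shift", "rot", "jitter", "colorjitter", "dropout")


def config_path(algorithm: str, domain: str, network: str,
                augmentation: Optional[str] = None) -> str:
    """Compose ./configs/mfrl/{algorithm}/{domain}/{network}[_{augmentation}].py."""
    if algorithm not in ALGORITHMS or domain not in DOMAINS or network not in NETWORKS:
        raise ValueError("unknown algorithm, domain, or network type")
    if augmentation is not None:
        if algorithm != "drq":
            raise ValueError("augmentation suffixes are described only for DrQ configs")
        if augmentation not in AUGMENTATIONS:
            raise ValueError(f"unknown augmentation: {augmentation!r}")
        if network == "cnn" and augmentation != "shift":
            raise ValueError("CNN can only use the shift augmentation")
    name = network if augmentation is None else f"{network}_{augmentation}"
    return f"./configs/mfrl/{algorithm}/{domain}/{name}.py"
```

With these rules, config_path("drq", "maniskill", "pn", "jitter") reproduces the second example path above, and an invalid pairing such as cnn with jitter is rejected before a run is launched.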
Please cite our paper if you find our idea helpful. Thanks a lot!
@article{ling2023efficacy,
title={On the Efficacy of 3D Point Cloud Reinforcement Learning},
author={Ling, Zhan and Yao, Yunchao and Li, Xuanlin and Su, Hao},
journal={arXiv preprint arXiv:2306.06799},
year={2023}
}
This project is licensed under the Apache 2.0 license.