Project Page | arXiv | Data
Yanjie Ze, Gu Zhang, Kangning Zhang, Chenyuan Hu, Muhan Wang, Huazhe Xu
Robotics: Science and Systems (RSS) 2024
3D Diffusion Policy (DP3) is a universal visual imitation learning algorithm that marries 3D visual representations with diffusion policies, achieving surprising effectiveness in diverse simulated and real-world tasks, including both high-dimensional and low-dimensional control tasks, with a practical inference speed.
Applications and extensions of DP3 from the community:
Simulation environments. We provide dexterous manipulation environments and expert policies for Adroit, DexArt, and MetaWorld in this codebase (3+4+50=57 tasks in total). The 3D modality generation (depths and point clouds) has been incorporated for these environments.
Real-world robot data is also provided here.
Algorithms. We provide the implementation of the following algorithms:
dp3.yaml
simple_dp3.yaml
Among these, dp3.yaml is the proposed algorithm in our paper, showing a significant improvement over the baselines. Training DP3 takes ~10 GB of GPU memory and ~3 hours on an Nvidia A40 GPU, so it is feasible for most researchers.
simple_dp3.yaml is a simplified version of DP3 that is much faster in training (1-2 hours) and inference (25 FPS) without much performance loss, so it is the recommended choice for robotics researchers.
See INSTALL.md for installation instructions.
See ERROR_CATCH.md for errors I personally encountered during installation.
You can generate demonstrations yourself using our provided expert policies. Generated demonstrations are saved under $YOUR_REPO_PATH/3D-Diffusion-Policy/data/.
Put the ckpts folder under $YOUR_REPO_PATH/third_party/VRL3/.
Put the assets folder under $YOUR_REPO_PATH/third_party/dexart-release/.
Note: since you are generating demonstrations yourself, the results could be slightly different from the results reported in the paper. This is normal, since the results of imitation learning depend heavily on demonstration quality. If you encounter bad demonstrations, please re-generate them; there is no need to open a new issue.
Scripts for generating demonstrations, training, and evaluation are all provided in the scripts/ folder.
The results are logged by wandb, so you need to run wandb login first to see the results and videos.
For more detailed arguments, please refer to the scripts and the code. Here we provide brief instructions for using the codebase.
Generate demonstrations with gen_demonstration_adroit.sh and gen_demonstration_dexart.sh. See the scripts for details. For example:
bash scripts/gen_demonstration_adroit.sh hammer
This will generate demonstrations for the hammer task in the Adroit environment. The data will be saved in the 3D-Diffusion-Policy/data/ folder automatically.
Train and evaluate a policy with behavior cloning. For example:
bash scripts/train_policy.sh dp3 adroit_hammer 0112 0 0
This will train a DP3 policy on the hammer task in the Adroit environment using the point cloud modality. By default we save the ckpt (optional in the script).
Evaluate a saved policy or use it for inference. For example:
bash scripts/eval_policy.sh dp3 adroit_hammer 0112 0 0
This will evaluate the saved DP3 policy you just trained. Note: the evaluation script is only provided for deployment/inference. For benchmarking, please use the results logged in wandb during training.
Hardware Setup
Software
Every collected real robot demonstration (episode length: T) is a dictionary:
For training and evaluation, you should process the point clouds (cropping using a bounding box and FPS downsampling) as described in the paper. We also provide an example script (here).
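The two preprocessing steps named above can be sketched in plain NumPy. This is a minimal illustration, not the repository's example script: it reads FPS here as farthest point sampling, and the function names, the bounding box, and the sample count are all hypothetical placeholders rather than values from the paper.

```python
import numpy as np


def crop_to_box(points, box_min, box_max):
    """Keep points whose XYZ lies inside the axis-aligned box [box_min, box_max]."""
    xyz = points[:, :3]
    inside = np.all((xyz >= box_min) & (xyz <= box_max), axis=1)
    return points[inside]


def farthest_point_sampling(points, num_samples, seed=0):
    """Greedy farthest point sampling on XYZ; extra columns are carried along."""
    xyz = points[:, :3]
    n = xyz.shape[0]
    if n <= num_samples:
        # Not enough points to downsample; return the input unchanged.
        return points
    rng = np.random.default_rng(seed)
    selected = np.empty(num_samples, dtype=np.int64)
    selected[0] = rng.integers(n)
    # Squared distance from every point to its nearest already-selected point.
    min_dist = np.full(n, np.inf)
    for i in range(1, num_samples):
        diff = xyz - xyz[selected[i - 1]]
        min_dist = np.minimum(min_dist, np.einsum("ij,ij->i", diff, diff))
        selected[i] = np.argmax(min_dist)
    return points[selected]


def preprocess_point_cloud(points, box_min, box_max, num_samples):
    """Crop to the workspace box, then downsample to a fixed number of points."""
    cropped = crop_to_box(points, box_min, box_max)
    return farthest_point_sampling(cropped, num_samples)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for a raw camera point cloud with six columns.
    cloud = rng.uniform(-1.0, 1.0, size=(20000, 6))
    out = preprocess_point_cloud(
        cloud,
        box_min=np.array([-0.5, -0.5, 0.0]),  # placeholder workspace bounds
        box_max=np.array([0.5, 0.5, 0.5]),
        num_samples=512,  # placeholder point budget
    )
    print(out.shape)
```

Cropping before sampling keeps the point budget focused on the workspace instead of the background. The greedy loop costs O(N x num_samples), which is usually acceptable offline; for real-time use a GPU implementation may be preferable.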
You can try using our provided real-world data to train the policy.
Put the data under the 3D-Diffusion-Policy/data/ folder, e.g. 3D-Diffusion-Policy/data/realdex_drill.zarr, and keep the path the same as 'zarr_path' in the task's yaml file.
Then train the policy. For example:
bash scripts/train_policy.sh dp3 realdex_drill 0112 0 0
We provide a simple visualizer for point clouds, which is convenient for debugging on headless machines. You can install it with:
cd visualizer
pip install -e .
Then you can visualize point clouds with:
import visualizer
your_pointcloud = ... # your point cloud data, numpy array with shape (N, 3) or (N, 6)
visualizer.visualize_pointcloud(your_pointcloud)
This will show the point cloud in a web browser.
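If your data is stored as separate position and color arrays, a small helper can assemble the (N, 3) or (N, 6) array described above. This is only a sketch: make_pointcloud is a hypothetical helper, not part of the visualizer package, and it assumes the last three columns of an (N, 6) array hold per-point color, which this README does not state.

```python
import numpy as np


def make_pointcloud(xyz, rgb=None):
    """Return a float32 array of shape (N, 3), or (N, 6) when rgb is given."""
    xyz = np.asarray(xyz, dtype=np.float32)
    if xyz.ndim != 2 or xyz.shape[1] != 3:
        raise ValueError(f"xyz must have shape (N, 3), got {xyz.shape}")
    if rgb is None:
        return xyz
    rgb = np.asarray(rgb, dtype=np.float32)
    if rgb.shape != xyz.shape:
        raise ValueError(f"rgb must have shape {xyz.shape}, got {rgb.shape}")
    # Assumed column layout: x, y, z, then three color channels.
    return np.concatenate([xyz, rgb], axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    xyz = rng.normal(size=(1024, 3))
    rgb = rng.uniform(0, 255, size=(1024, 3))  # color range is an assumption
    cloud = make_pointcloud(xyz, rgb)
    print(cloud.shape)
    # With the visualizer package installed:
    # import visualizer
    # visualizer.visualize_pointcloud(cloud)
```

Validating the shape up front makes a malformed array fail with a clear message before it reaches the visualizer.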
A strength of DP3 is its universality: you can easily run DP3 on your own tasks. All you need to do is make this codebase support your task in our format. Here are some simple steps:
Write the environment wrapper for your task. The wrapper makes the environment interface easy to use. See 3D-Diffusion-Policy/diffusion_policy_3d/env/adroit for an example.
Add the environment runner for your task. See 3D-Diffusion-Policy/diffusion_policy_3d/env_runner/ for examples.
Prepare expert data for your task. The script third_party/VRL3/src/gen_demonstration.py is a good example of how to generate demonstrations in our format. Basically, expert data is a sequence of saved state-action pairs.
Add the dataset which loads your data. See 3D-Diffusion-Policy/diffusion_policy_3d/dataset/ for examples.
Add the config file in 3D-Diffusion-Policy/diffusion_policy_3d/configs/task. There are many examples in the folder.
Train and evaluate DP3 on your task. See 3D-Diffusion-Policy/scripts/train_policy.sh for examples.
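To make "state-action pairs saved in a sequence" concrete, here is one possible layout sketched with plain NumPy: all episodes are concatenated along the time axis, and a separate array records where each episode ends. The key names, array dimensions, and the episode_ends convention are assumptions for illustration only; check gen_demonstration.py and the dataset classes for the field names and storage format the codebase actually expects.

```python
import numpy as np


def pack_episodes(episodes):
    """Concatenate per-episode arrays and record where each episode ends."""
    keys = episodes[0].keys()
    data = {k: np.concatenate([ep[k] for ep in episodes], axis=0) for k in keys}
    # episode_ends[i] is the exclusive end index of episode i in the flat arrays.
    episode_ends = np.cumsum([len(ep["action"]) for ep in episodes])
    return data, episode_ends


def get_episode(data, episode_ends, index):
    """Slice one episode back out of the flat arrays."""
    start = 0 if index == 0 else int(episode_ends[index - 1])
    end = int(episode_ends[index])
    return {k: v[start:end] for k, v in data.items()}


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    episodes = []
    for length in (40, 55):  # two synthetic episodes of different lengths
        episodes.append({
            # Hypothetical keys and dimensions, chosen only for this sketch.
            "point_cloud": rng.normal(size=(length, 512, 3)).astype(np.float32),
            "state": rng.normal(size=(length, 24)).astype(np.float32),
            "action": rng.normal(size=(length, 26)).astype(np.float32),
        })
    data, episode_ends = pack_episodes(episodes)
    print({k: v.shape for k, v in data.items()})
    print(episode_ends)
    second = get_episode(data, episode_ends, 1)
    print(second["action"].shape)
```

Storing flat arrays plus end indices lets a dataset class sample fixed-length windows without crossing episode boundaries, which is the main reason to prefer it over a list of variable-length episodes.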
This repository is released under the MIT license. See LICENSE for additional details.
Our code is generally built upon: Diffusion Policy, DexMV, DexArt, VRL3, DAPG, DexDeform, RL3D, GNFactor, H-InDex, MetaWorld, BEE, Bi-DexHands, HORA. We thank all these authors for their nicely open sourced code and their great contributions to the community.
Contact Yanjie Ze if you have any questions or suggestions.
If you find our work useful, please consider citing:
@inproceedings{Ze2024DP3,
title={3D Diffusion Policy: Generalizable Visuomotor Policy Learning via Simple 3D Representations},
author={Yanjie Ze and Gu Zhang and Kangning Zhang and Chenyuan Hu and Muhan Wang and Huazhe Xu},
booktitle={Proceedings of Robotics: Science and Systems (RSS)},
year={2024}
}