UMI on Legs: Making Manipulation Policies Mobile with Manipulation-Centric Whole-body Controllers
[Huy Ha](https://www.cs.columbia.edu/~huy/)$^{🐶,1,2}$, [Yihuai Gao](https://yihuai-gao.github.io/)$^{🐶,1}$, [Zipeng Fu](https://zipengfu.github.io/)$^1$, [Jie Tan](https://www.jie-tan.net/)$^{3}$, [Shuran Song](https://shurans.github.io/)$^{1,2}$
$^1$ Stanford University, $^2$ Columbia University, $^3$ Google DeepMind, $^🐶$ Equal Contribution
[Project Page](https://umi-on-legs.github.io/) | [arXiv](https://arxiv.org/abs/2407.10353) | [Video](https://www.youtube.com/watch?v=4Bp0q3xHTxE)
UMI on Legs is a framework for combining real-world human demonstrations with simulation-trained whole-body controllers, providing a scalable approach for manipulation skills on robot dogs with arms.
The best part? You can plug-and-play your existing visuomotor policies onto a quadruped, making your manipulation policies mobile!
This repository includes source code for whole-body controller simulation training, whole-body controller real-world deployment, the iPhone odometry iOS application, the UMI real-world environment class, and the ARX5 SDK.
We've published our code the same way we developed it - as separate submodules - in the hope that the community can easily pull out any component they find useful and plug it into their own system.
If you find this codebase useful, consider citing:
```bibtex
@inproceedings{ha2024umionlegs,
  title={{UMI} on Legs: Making Manipulation Policies Mobile with Manipulation-Centric Whole-body Controllers},
  author={Huy Ha and Yihuai Gao and Zipeng Fu and Jie Tan and Shuran Song},
  year={2024},
}
```
If you have any questions, please contact Huy Ha at huyha [at] stanford [dot] edu or Yihuai Gao at yihuai [at] stanford [dot] edu.
Table of Contents
If you just want to start running some commands while skimming the paper, you should get started here, which walks you through downloading data and checkpoints and rolling out the WBC.
The rest of the documentation focuses on setting up real-world deployment.
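To make the "roll out the WBC" step concrete, here is a minimal, self-contained sketch of what a policy rollout loop looks like. The observation and action sizes, the stand-in network, and the zero observations are placeholders for illustration only, not the repo's actual API or checkpoint format.

```python
import torch

# Stand-in policy so this sketch runs on its own; in practice you would load the
# trained whole-body controller checkpoint (e.g. with torch.jit.load or torch.load).
policy = torch.nn.Sequential(
    torch.nn.Linear(75, 256),  # hypothetical observation size: proprioception + task commands
    torch.nn.ELU(),
    torch.nn.Linear(256, 18),  # hypothetical action size: 12 leg joints + 6 arm joints
)
policy.eval()

obs = torch.zeros(1, 75)
with torch.no_grad():
    for _ in range(100):        # closed-loop control at the controller's frequency
        action = policy(obs)    # joint position targets for the legs and arm
        # The simulator (or real robot) would consume `action` and return the next
        # observation; a zero tensor stands in for that step here.
        obs = torch.zeros(1, 75)
```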
Code Acknowledgements
Whole-body Controller Simulation Training:
- Like many other RL-for-control works nowadays, we started with legged gym, Nikita Rudin's implementation of PPO and Gym-style environment wrapper around IsaacGym. Shout out to Nikita for publishing such a hackable codebase - it's truly an amazing contribution to our community.
- Although not used in the final results of the paper, our codebase does include a modified Perlin Noise Terrain from DeepWBC. To use it, run training with `env.cfg.terrain.mode=perlin` (a generic sketch of Perlin-noise terrain generation follows this list).
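For intuition about what the Perlin terrain option produces, below is a generic numpy sketch of a 2D Perlin-noise heightfield. The function name, cell size, and amplitude are illustrative assumptions; this is not the repo's (or DeepWBC's) implementation.

```python
import numpy as np

def perlin_heightfield(rows, cols, cell=16, amplitude=0.08, seed=0):
    """Generic 2D Perlin-noise heightfield (in meters); illustrative only."""
    rng = np.random.default_rng(seed)
    # Random unit gradient vectors on a coarse lattice.
    lat_rows, lat_cols = rows // cell + 2, cols // cell + 2
    angles = rng.uniform(0.0, 2.0 * np.pi, size=(lat_rows, lat_cols))
    grads = np.stack([np.cos(angles), np.sin(angles)], axis=-1)

    # Fractional position of every heightfield pixel within its lattice cell.
    ys, xs = np.meshgrid(np.arange(rows) / cell, np.arange(cols) / cell, indexing="ij")
    y0, x0 = ys.astype(int), xs.astype(int)
    fy, fx = ys - y0, xs - x0

    def fade(t):  # Perlin's smooth interpolation weight
        return t * t * t * (t * (t * 6 - 15) + 10)

    def corner_dot(dy, dx):  # dot product of corner gradient with offset to that corner
        g = grads[y0 + dy, x0 + dx]
        return g[..., 0] * (fy - dy) + g[..., 1] * (fx - dx)

    u, v = fade(fx), fade(fy)
    top = corner_dot(0, 0) * (1 - u) + corner_dot(0, 1) * u
    bottom = corner_dot(1, 0) * (1 - u) + corner_dot(1, 1) * u
    return amplitude * (top * (1 - v) + bottom * v)

heights = perlin_heightfield(256, 256)  # e.g. converted into a simulator heightfield terrain
```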
Whole-body Controller Deployment:
- Thanks to Qi Wu for providing us with an initial deployment script for the whole-body controller!
iPhone Odometry Application:
- Thanks to Zhenjia Xu for providing us with some starter code for ARKit camera pose publishing!
UMI Environment Class:
- Our UMI deployment codebase heavily builds upon the original UMI codebase. Big thanks to the UMI team!
OptiTrack Motion Capture Setup:
- Thanks to Jingyun Yang and Zi-ang Cao for providing the OptiTrack motion capture code and helping us to set it up!