
MimicPlay


Table of Contents

- Overview
- Installation
- Dataset
- Training
- Baseline Training
- Evaluation
- Trained Checkpoints
- Human play data processing
- Citations
- License


Overview

This repository contains the implementation code of the CoRL 2023 Oral paper "MimicPlay: Long-Horizon Imitation Learning by Watching Human Play" (arXiv, Project, Video) by Wang et al. at NVIDIA Research and the Stanford Vision and Learning Lab.

In this repo, we provide our full implementation of training and evaluation in simulation, along with the scripts for generating human play datasets for real-world experiments. Note that our main paper leverages human play data. In simulation, however, there is no way to collect such a dataset; any play data collected there is necessarily robot teleoperation. Therefore, in this repo we use the same robot play dataset to train both the high-level planner and the low-level policy, and reproduce the advantage of MimicPlay (0-human) over the baselines in simulation. For real-world experiments, we provide the processing scripts for human play videos, which generate a dataset that can be used directly for training the high-level latent planner in this repo.


Installation

Create and activate conda environment

conda create -n mimicplay python=3.8
conda activate mimicplay

The simulation experiments of MimicPlay are tested on LIBERO, a multitask manipulation benchmark built on robosuite and the latest MuJoCo. We chose LIBERO because it uses the BDDL language for goal specification, which facilitates multitask evaluation for learning from play data.

# Install MuJoCo
pip install mujoco

# Install robosuite
git clone https://github.com/ARISE-Initiative/robosuite.git
cd robosuite
git checkout v1.4.1_libero
pip install -r requirements.txt
pip install -r requirements-extra.txt
pip install -e .

# Install BDDL
cd ..
git clone https://github.com/StanfordVL/bddl.git
cd bddl
pip install -e .

# Install LIBERO
cd ..
git clone https://github.com/Lifelong-Robot-Learning/LIBERO.git
cd LIBERO
pip install -r requirements.txt
pip install -e .

MimicPlay is based on robomimic, which provides the basic infrastructure for learning from offline demonstrations.

cd ..
git clone https://github.com/ARISE-Initiative/robomimic
cd robomimic
git checkout mimicplay-libero
pip install -e .

Install MimicPlay

cd ..
git clone https://github.com/j96w/MimicPlay.git
cd MimicPlay
pip install -e .

Dataset

You can download the collected play data (for training) and task video prompts (for multitask evaluation) from Link. The play data is a set of demonstrations without a specific task goal (unlabeled). We recommend downloading the raw data (demo.hdf5) and processing it into the training dataset with image observations (demo_image.hdf5) on your local machine, since this is a good way to check whether the environment libraries are installed correctly. To process the raw data into the dataset with visual observations, please follow these steps:
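Independent of the exact processing steps, it helps to sanity-check the resulting HDF5 layout before training. The snippet below builds a tiny synthetic file mimicking a robomimic-style layout (the group and key names here are illustrative assumptions, not necessarily the exact layout of demo.hdf5) and then walks it with h5py:

```python
import h5py
import numpy as np

# Build a tiny synthetic file that mimics a robomimic-style layout
# (group/key names are assumptions for illustration only).
with h5py.File("demo_example.hdf5", "w") as f:
    demo = f.create_group("data/demo_0")
    demo.create_dataset("actions", data=np.zeros((10, 7), dtype=np.float32))
    demo.create_dataset("obs/agentview_image",
                        data=np.zeros((10, 84, 84, 3), dtype=np.uint8))
    demo.attrs["num_samples"] = 10

def list_datasets(path="demo_example.hdf5"):
    """Return (name, shape) for every dataset in the file."""
    found = []
    with h5py.File(path, "r") as f:
        def visit(name, obj):
            if isinstance(obj, h5py.Dataset):
                found.append((name, obj.shape))
        f.visititems(visit)
    return found

for name, shape in list_datasets():
    print(name, shape)
```

Running the same walk on your processed demo_image.hdf5 quickly reveals whether the image observations were written with the shapes you expect.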


Training

MimicPlay is a hierarchical algorithm for learning from play data (uncut, unlabeled demonstrations), which consists of two training stages: (1) learning a goal-conditioned high-level latent planner; (2) learning a plan-guided low-level robot controller.
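The data flow of this two-stage hierarchy can be sketched as follows. Everything here is an illustrative assumption (random-weight placeholder networks, made-up dimensions), not the repo's actual modules: the high-level planner maps the current and goal observations to a latent plan, and the low-level controller conditions on that latent to produce an action.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(sizes):
    """Random-weight MLP layers (placeholders for trained networks)."""
    return [rng.standard_normal((a, b)) * 0.1 for a, b in zip(sizes[:-1], sizes[1:])]

def forward(layers, x):
    for W in layers[:-1]:
        x = np.tanh(x @ W)
    return x @ layers[-1]

OBS, LATENT, ACT = 64, 16, 7  # illustrative dimensions

# Stage 1: goal-conditioned high-level latent planner.
planner = mlp([2 * OBS, 128, LATENT])

# Stage 2: plan-guided low-level robot controller.
controller = mlp([OBS + LATENT, 128, ACT])

obs, goal = rng.standard_normal(OBS), rng.standard_normal(OBS)
latent_plan = forward(planner, np.concatenate([obs, goal]))
action = forward(controller, np.concatenate([obs, latent_plan]))
print(latent_plan.shape, action.shape)  # (16,) (7,)
```

The key design point is that the controller never sees the goal image directly; it only receives the compact latent plan, which is what allows the planner to be trained on cheap (human) play data while the controller is trained on robot data.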


Baseline Training

To run a BC-transformer baseline with the same model size and architecture, simply run:

python scripts/train.py --config configs/BC_trans_scratch.json --dataset 'datasets/playdata/image_demo_local.hdf5' --bddl_file 'scripts/bddl_files/KITCHEN_SCENE9_eval-task-1_turn_on_stove_put_pan_on_stove_put_bowl_on_shelf.bddl' --video_prompt 'datasets/eval-task-1_turn_on_stove_put_pan_on_stove_put_bowl_on_shelf/image_demo.hdf5'

The only difference between this baseline and MimicPlay is that it has no high-level latent planner and instead trains an end-to-end low-level controller directly with goal-image inputs (same as Play-LMP).

To run a BC-RNN-GMM baseline (robomimic) with play data, simply run:

python scripts/train.py --config configs/BC_RNN_scratch.json --dataset 'datasets/playdata/image_demo_local.hdf5' --bddl_file 'scripts/bddl_files/KITCHEN_SCENE9_eval-task-1_turn_on_stove_put_pan_on_stove_put_bowl_on_shelf.bddl' --video_prompt 'datasets/eval-task-1_turn_on_stove_put_pan_on_stove_put_bowl_on_shelf/image_demo.hdf5'

Evaluation

To evaluate on all tasks (set the path of the trained policy model in run_trained_agent_multitask.sh):

./scripts/run_trained_agent_multitask.sh

To evaluate on a specific task, simply run:

python scripts/run_trained_agent.py --agent 'path_to_trained_low-level_policy' --bddl_file 'path_to_task_bddl_file' --video_prompt 'path_to_task_video_prompt' --video_path 'eval_rollouts.mp4'

For example, you can use scripts/bddl_files/KITCHEN_SCENE9_eval-task-1_turn_on_stove_put_pan_on_stove_put_bowl_on_shelf.bddl and datasets/eval-task-1_turn_on_stove_put_pan_on_stove_put_bowl_on_shelf/image_demo.hdf5 as the bddl_file and video_prompt.


Trained Checkpoints

You can download our trained checkpoints from Link. The comparison results between MimicPlay (0-human) and the baselines in simulation can be found in the appendix of our paper.


Human play data processing

In the real-world experiments, MimicPlay leverages human play data. The following example guides you through generating an HDF5 dataset file from two MP4 video files (dual camera views for 3D hand trajectories). The steps are:
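The geometric core of this pipeline can be sketched independently of the repo's scripts. Assuming 2D hand keypoints have already been detected in each camera view and the two cameras' 3x4 projection matrices are known from calibration (both assumptions, as the actual detection and calibration are handled by the processing scripts), each 3D trajectory point can be recovered by standard Direct Linear Transform (DLT) triangulation:

```python
import numpy as np

def triangulate_dlt(P1, P2, uv1, uv2):
    """Triangulate one 3D point from two views via DLT, given 3x4
    projection matrices P1, P2 and pixel coordinates uv1, uv2."""
    A = np.stack([
        uv1[0] * P1[2] - P1[0],
        uv1[1] * P1[2] - P1[1],
        uv2[0] * P2[2] - P2[0],
        uv2[1] * P2[2] - P2[1],
    ])
    # The homogeneous 3D point is the right singular vector of A
    # with the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]

def project(P, X):
    """Project a 3D point to pixel coordinates with a 3x4 matrix."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Synthetic check: two cameras observing a known point.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])              # reference camera
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])  # shifted baseline
X_true = np.array([0.5, 0.2, 3.0])

X_hat = triangulate_dlt(P1, P2, project(P1, X_true), project(P2, X_true))
print(np.allclose(X_hat, X_true, atol=1e-6))  # True
```

Applying this per frame to the detected hand keypoints from the two MP4 views yields the 3D hand trajectory that is then written into the HDF5 dataset for planner training.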


Citations

Please cite MimicPlay if you find this repository helpful:

@article{wang2023mimicplay,
  title={Mimicplay: Long-horizon imitation learning by watching human play},
  author={Wang, Chen and Fan, Linxi and Sun, Jiankai and Zhang, Ruohan and Fei-Fei, Li and Xu, Danfei and Zhu, Yuke and Anandkumar, Anima},
  journal={arXiv preprint arXiv:2302.12422},
  year={2023}
}

License

Licensed under the MIT License