
# 🦾 Heterogeneous Pre-trained Transformers


Lirui Wang, Xinlei Chen, Jialiang Zhao, Kaiming He

Neural Information Processing Systems (Spotlight), 2024


This is a PyTorch implementation for pre-training Heterogeneous Pre-trained Transformers (HPTs). The pre-training procedure trains on a mixture of embodiment datasets with a supervised learning objective. Pre-training can take some time, so we also provide pre-trained checkpoints below. You can find more details on our [project page](https://liruiw.github.io/hpt). An alternative clean implementation of HPT in Hugging Face can also be found here.

TL;DR: HPT aligns different embodiments to a shared latent space and investigates the scaling behaviors in policy learning. Put a scalable transformer in the middle of your policy, and don't train from scratch!
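As a mental model, here is a minimal PyTorch sketch of that stem-trunk-head split. Everything in it (the class name, layer sizes, mean-pooling) is an illustrative assumption rather than the repo's actual modules; the point is only that embodiment-specific stems and heads wrap a shared, pre-trainable trunk transformer.

```
# Illustrative sketch only: names and sizes are hypothetical, not HPT's real classes.
import torch
import torch.nn as nn

class StemTrunkHeadPolicy(nn.Module):
    def __init__(self, obs_dim, action_dim, d_model=256, n_tokens=16):
        super().__init__()
        # Embodiment-specific stem: maps heterogeneous observations
        # to a fixed number of tokens in the shared latent space.
        self.stem = nn.Linear(obs_dim, n_tokens * d_model)
        # Shared trunk: the scalable transformer that is pre-trained
        # across embodiments and reused instead of trained from scratch.
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.trunk = nn.TransformerEncoder(layer, num_layers=4)
        # Embodiment-specific head: decodes trunk features into actions.
        self.head = nn.Linear(d_model, action_dim)
        self.n_tokens, self.d_model = n_tokens, d_model

    def forward(self, obs):
        tokens = self.stem(obs).view(-1, self.n_tokens, self.d_model)
        feats = self.trunk(tokens)           # shared latent computation
        return self.head(feats.mean(dim=1))  # pool tokens, predict action

policy = StemTrunkHeadPolicy(obs_dim=39, action_dim=4)
action = policy(torch.randn(2, 39))  # batch of 2 observations -> (2, 4) actions
```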

βš™οΈ Setup

  1. Run `pip install -e .`.
  2. Install (old-version) MuJoCo:

```
mkdir ~/.mujoco
cd ~/.mujoco
wget https://mujoco.org/download/mujoco210-linux-x86_64.tar.gz -O mujoco210.tar.gz --no-check-certificate
tar -xvzf mujoco210.tar.gz

# add the following lines to ~/.bashrc if needed
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:${HOME}/.mujoco/mujoco210/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/nvidia
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64
export MUJOCO_GL=egl
```
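Once MuJoCo is unpacked and the environment variables are set, a quick sanity check can catch path issues early. This sketch assumes the legacy `mujoco-py` bindings (the ones that pair with MuJoCo 2.1.0) are installed in your environment:

```
# Sanity check for the MuJoCo 2.1.0 install (assumes mujoco-py bindings).
import mujoco_py

# A trivial model: one static sphere in an empty world.
model = mujoco_py.load_model_from_xml(
    "<mujoco><worldbody><body><geom size='0.1'/></body></worldbody></mujoco>"
)
sim = mujoco_py.MjSim(model)
sim.step()
print("MuJoCo OK, sim time:", sim.data.time)
```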

## 🚢 Usage

  1. Check out `quickstart.ipynb` for how to use the pre-trained HPTs.
  2. Run `python -m hpt.run` to train policies on each environment. Add `+mode=debug` for debugging.
  3. Run `bash experiments/scripts/metaworld/train_test_metaworld_1task.sh test test 1 +mode=debug` for an example script.
  4. Set `train.pretrained_dir` to load the pre-trained trunk transformer. The model can be loaded either from a local checkpoint folder or from a Hugging Face repository (see the sketch after this list).
  5. Run the following script for the MuJoCo (Meta-World 20-task) experiments:

```
bash experiments/scripts/metaworld/train_test_metaworld_20task_finetune.sh hf://liruiw/hpt-base
```

  6. See [here](experiments/config/config.yaml) for defining and modifying the hyperparameters.
  7. We use [wandb](https://wandb.ai/home) to log the training process.
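As referenced in item 4 above, checkpoints can also be fetched programmatically. `hf_hub_download` below is the standard Hugging Face Hub API, but the checkpoint file name is a hypothetical placeholder; see `quickstart.ipynb` for the repo's actual loading helpers.

```
# Fetch a pre-trained HPT checkpoint from the Hub and peek at it.
# NOTE: the file name "trunk.pth" is a hypothetical placeholder.
import torch
from huggingface_hub import hf_hub_download

ckpt_path = hf_hub_download(repo_id="liruiw/hpt-base", filename="trunk.pth")
state_dict = torch.load(ckpt_path, map_location="cpu")
print(sorted(state_dict.keys())[:5])  # inspect trunk parameter names
```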

## 🤖 Try this On Your Own Dataset

  1. For training, you need a `convert_dataset` function that packs your own dataset into the expected format (a hypothetical sketch follows this list). Check this for an example.
  2. For evaluation, you need a `rollout_runner.py` file for each benchmark and a `learner_trajectory_generator` evaluation function that provides rollouts.
  3. If needed, modify the configs to change the perception stem networks and action head networks in the models. See `realrobot_image.yaml` for an example real-world script.
  4. Add `dataset.use_disk=True` to save and load the dataset on disk.
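The sketch referenced in item 1 above: a hypothetical `convert_dataset`-style generator that flattens episodes into per-step dictionaries. The field names (`observation`, `image`, `state`, `action`) are assumptions; match them to the repo's conversion example.

```
# Hypothetical sketch of packing your own data; field names are assumptions.
import numpy as np

def convert_dataset(raw_episodes):
    """Yield one dict per step so episodes can be packed into the HPT format."""
    for episode in raw_episodes:
        for step in episode:
            yield {
                "observation": {
                    "image": np.asarray(step["rgb"], dtype=np.uint8),        # H x W x 3
                    "state": np.asarray(step["proprio"], dtype=np.float32),  # proprioception
                },
                "action": np.asarray(step["action"], dtype=np.float32),
            }
```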

## 💽 Checkpoints

You can find pretrained HPT checkpoints here. At the moment we provide the following model versions:

| Model                    | Size          |
| ------------------------ | ------------- |
| HPT-XLarge               | 226.8M params |
| HPT-Large                | 50.5M params  |
| HPT-Base                 | 12.6M params  |
| HPT-Small                | 3.1M params   |
| HPT-Base (with language) | 50.6M params  |
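To use one of these checkpoints as the pre-trained trunk, either pass the `hf://liruiw/hpt-base` shorthand directly (as in the Meta-World script above) or download a snapshot locally with the standard `huggingface_hub` API and point `train.pretrained_dir` at the folder:

```
# Download a checkpoint repository locally; the resulting folder can be
# passed as train.pretrained_dir (the hf:// shorthand skips this step).
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="liruiw/hpt-base")
print(local_dir)  # local cache path containing the checkpoint files
```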

## 💾 File Structure

```
├── ...
├── HPT
|   ├── data            # cached datasets
|   ├── output          # trained models and figures
|   ├── env             # environment wrappers
|   ├── hpt             # model training and dataset source code
|   |   ├── models      # network models
|   |   ├── datasets    # dataset related
|   |   ├── run         # transfer learning main loop
|   |   ├── run_eval    # evaluation main loop
|   |   └── ...
|   ├── experiments     # training configs
|   |   ├── configs     # modular configs
└── ...
```

## 🕹️ Citation

If you find HPT useful in your research, please consider citing:

```
@inproceedings{wang2024hpt,
  author    = {Lirui Wang and Xinlei Chen and Jialiang Zhao and Kaiming He},
  title     = {Scaling Proprioceptive-Visual Learning with Heterogeneous Pre-trained Transformers},
  booktitle = {Neural Information Processing Systems},
  year      = {2024}
}
```

## Contact

If you have any questions, feel free to contact me through email (liruiw@mit.edu). Enjoy!