Code for training imitation learning agents for ObjectNav and Pick-and-Place in Habitat. This repo is the official code repository for the paper Habitat-Web: Learning Embodied Object-Search from Human Demonstrations at Scale.
We provide best checkpoints for agents trained on ObjectNav and Pick-and-Place. You can use the following checkpoints to reproduce results reported in our paper.
Task | Split | Checkpoint | Success Rate | SPL |
---|---|---|---|---|
ObjectNav | v1 | objectnav_semseg.ckpt | 27.8 | 9.9 |
Pick-and-Place | New Initializations | pick_place_rgbd_new_inits.ckpt | 17.5 | 9.8 |
Pick-and-Place | New Instructions | pick_place_rgbd_new_insts.ckpt | 15.1 | 8.3 |
Pick-and-Place | New Environments | pick_place_rgbd_new_envs.ckpt | 8.3 | 4.1 |
You can find the pretrained RedNet semantic segmentation model weights here and the pretrained depth encoder weights here.
The primary code contributions from the paper are located in:
- Imitation Learning Baselines:
  - `habitat_baselines/il/env_based/`
  - `habitat_baselines/il/disk_based/`
- Experiment Configurations:
  - `habitat_baselines/config/objectnav/*.yaml`
  - `habitat_baselines/config/pickplace/*.yaml`
- Replay Scripts:
  - `examples/objectnav_replay.py`
  - `examples/pickplace_replay.py`
Clone the repository and install `habitat-web-baselines` using the commands below. Note that `python=3.6` is required for working with `habitat-web-baselines`. All development was done on `habitat-lab=0.1.6`.
```shell
git clone https://github.com/Ram81/habitat-web-baselines.git
cd habitat-web-baselines

# We require python>=3.6 and cmake>=3.10
conda create -n habitat-web python=3.6 cmake=3.14.0
conda activate habitat-web

pip install -e .
python setup.py develop --all
```
Install our custom build of `habitat-sim`. We highly recommend building `habitat-sim` from source for working with `habitat-web-baselines`. Use the following commands to set it up:
```shell
git clone git@github.com:Ram81/habitat-sim.git
cd habitat-sim
```
Install dependencies
Common
```shell
pip install -r requirements.txt
```
Linux (tested on Ubuntu 18.04 with gcc 7.4.0)
```shell
sudo apt-get update || true
# These packages are fairly ubiquitous and your system likely has them already;
# if not, install the essentials for EGL support:
sudo apt-get install -y --no-install-recommends \
    libjpeg-dev libglm-dev libgl1-mesa-glx libegl1-mesa-dev mesa-utils xorg-dev freeglut3-dev
```
See this configuration for a full list of dependencies that our CI installs on a clean Ubuntu VM. If you run into build errors later, this is a good place to check if all dependencies are installed.
Build Habitat-Sim
Default build with bullet (for machines with a display attached)
```shell
# Assuming we're still within the habitat conda environment
./build.sh --bullet
```
For headless systems (i.e. without an attached display, e.g. in a cluster) and multiple GPU systems
```shell
./build.sh --headless --bullet
```
For use with `habitat-web-baselines` and your own Python code, add `habitat-sim` to your `PYTHONPATH`. For example, modify your `.bashrc` (or `.bash_profile` on Mac OS X) by adding the line:

```shell
export PYTHONPATH=$PYTHONPATH:/path/to/habitat-sim/
```
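To confirm the path is picked up, you can run a quick check in a fresh shell (this is a generic sanity check, not a script from the repo; keep the placeholder path until you substitute your actual clone location):

```shell
# Append the habitat-sim source directory to PYTHONPATH for the current shell.
# /path/to/habitat-sim is a placeholder -- replace with your clone location.
export PYTHONPATH=$PYTHONPATH:/path/to/habitat-sim/
# List sys.path entries that mention habitat-sim; an empty list means the
# export did not take effect.
python3 -c "import sys; print([p for p in sys.path if 'habitat-sim' in p])"
```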
Download the MP3D dataset using the instructions here: https://github.com/facebookresearch/habitat-lab#scenes-datasets (download the full MP3D dataset for use with habitat)
Move the MP3D scene dataset or create a symlink at `data/scene_datasets/mp3d`.
Download the object assets used for Pick-and-Place task and THDA ObjectNav episodes from here.
Unzip the object assets and verify they are stored at `data/test_assets/objects/`.
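As a quick sanity check, a few lines of Python can report which of the expected directories are missing. This is a hedged sketch, not part of the repo: `missing_assets` is a hypothetical helper, and the paths it checks are taken from the layout described in this README.

```python
from pathlib import Path

def missing_assets(root="data"):
    """Return expected data subdirectories (from this README) that are absent.

    The entries checked here are illustrative, not an exhaustive manifest.
    """
    expected = [
        "scene_datasets/mp3d",
        "test_assets/objects",
    ]
    base = Path(root)
    return [p for p in expected if not (base / p).exists()]

if __name__ == "__main__":
    # Demo against a throwaway directory so the snippet runs anywhere.
    import os
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        os.makedirs(Path(tmp) / "test_assets" / "objects")
        print(missing_assets(tmp))  # -> ['scene_datasets/mp3d']
```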
You can use the following datasets to reproduce results reported in our paper.
Dataset | Scene dataset | Split | Link | Extract path |
---|---|---|---|---|
ObjectNav-HD | MP3D | 70k | objectnav_mp3d_70k.json.gz | data/datasets/objectnav/objectnav_mp3d_70k/ |
ObjectNav-HD | MP3D+Gibson | Full | objectnav_mp3d_gibson_80k.json.gz | data/datasets/objectnav/objectnav_mp3d_gibson_80k/ |
Pick-and-Place-HD | MP3D | Full | pick_place_12k.json.gz | data/datasets/pick_place/pick_place_12k/ |
Pick-and-Place-HD | MP3D | New Initializations | pick_place_unseen_initializations.json.gz | data/datasets/pick_place/unseen_initializations/ |
Pick-and-Place-HD | MP3D | New Instructions | pick_place_unseen_instructions.json.gz | data/datasets/pick_place/unseen_instructions/ |
Pick-and-Place-HD | MP3D | New Environments | pick_place_unseen_scenes.json.gz | data/datasets/pick_place/unseen_scenes/ |
The demonstration datasets released as part of this project are licensed under a Creative Commons Attribution-NonCommercial 4.0 License.
The code requires the datasets to be placed under the `data` folder in the following format:
```
habitat-web-baselines/
└── data/
    ├── scene_datasets/
    │   └── mp3d/
    │       ├── JeFG25nYj2p.glb
    │       └── JeFG25nYj2p.navmesh
    ├── datasets/
    │   ├── objectnav/
    │   │   └── objectnav_mp3d_70k/
    │   │       └── train/
    │   └── pick_place/
    │       └── pick_place_12k/
    │           └── train/
    └── test_assets/
        └── objects/
            ├── apple.glb
            └── plate.glb
```
We also provide an example of packaging your own demonstration dataset for training imitation learning agents with `habitat-imitation-baselines` here.
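The dataset files above are gzipped JSON. As a rough sketch of what packaging demonstrations can look like, the snippet below writes and reads such a file; the field names (`episodes`, `episode_id`, `scene_id`, `reference_replay`) follow common Habitat dataset conventions but are assumptions here, not the repo's exact schema — see the linked example for the real format.

```python
import gzip
import json
import os
import tempfile

# Hypothetical episode record; real habitat-web episodes carry more fields
# (goals, start pose, etc.) -- consult the linked packaging example.
episodes = {
    "episodes": [
        {
            "episode_id": "0",
            "scene_id": "mp3d/JeFG25nYj2p/JeFG25nYj2p.glb",
            # One entry per step of the human demonstration.
            "reference_replay": [{"action": "MOVE_FORWARD"}, {"action": "STOP"}],
        }
    ]
}

path = os.path.join(tempfile.gettempdir(), "my_demos.json.gz")
with gzip.open(path, "wt") as f:
    json.dump(episodes, f)

# Read it back the same way a replay script would load a dataset file.
with gzip.open(path, "rt") as f:
    loaded = json.load(f)
print(len(loaded["episodes"]), loaded["episodes"][0]["episode_id"])  # -> 1 0
```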
To verify that the data is set up correctly, run:
```shell
python examples/objectnav_replay.py --path data/datasets/objectnav/objectnav_mp3d_70k/sample/sample.json.gz
```
For training the behavior cloning policy on the ObjectGoal Navigation task using the environment-based setup:
Use the following script for multi-node training:

```shell
sbatch job_scripts/run_objectnav_training.sh habitat_baselines/config/objectnav/il_ddp_objectnav.yaml
```
To run training on a single node, use:

```shell
sbatch job_scripts/run_objectnav_training.sh habitat_baselines/config/objectnav/il_objectnav.yaml
```
For training the behavior cloning policy on the Pick-and-Place task using the disk-based setup:
Use the following script for multi-node training:

```shell
sbatch job_scripts/run_pickplace_training.sh ddp
```
To run training on a single node, use:

```shell
sbatch job_scripts/run_pickplace_training.sh single_node
```
To evaluate a pretrained checkpoint on ObjectGoal Navigation, download the `objectnav_mp3d_v1` dataset from here.
For evaluating a checkpoint on the ObjectGoal Navigation task using the environment-based setup:
Use the following script if the checkpoint was trained using the distributed setup:

```shell
sbatch job_scripts/run_objectnav_eval.sh habitat_baselines/config/objectnav/il_ddp_objectnav.yaml data/datasets/objectnav_mp3d_v1 /path/to/checkpoint
```
Use the following script to evaluate a checkpoint trained on a single node:

```shell
sbatch job_scripts/run_objectnav_eval.sh habitat_baselines/config/objectnav/il_objectnav.yaml data/datasets/objectnav_mp3d_v1 /path/to/checkpoint
```
For evaluating the behavior cloning policy on the Pick-and-Place task using the disk-based setup:
Use the following script if the checkpoint was trained using the distributed setup:

```shell
sbatch job_scripts/run_pickplace_eval.sh ddp
```
Use the following script to evaluate a checkpoint trained on a single node:

```shell
sbatch job_scripts/run_pickplace_eval.sh single_node
```
If you use this code in your research, please consider citing:
```
@inproceedings{ramrakhya2022,
  title={Habitat-Web: Learning Embodied Object-Search Strategies from Human Demonstrations at Scale},
  author={Ram Ramrakhya and Eric Undersander and Dhruv Batra and Abhishek Das},
  year={2022},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
}
```