makarandtapaswi / Real2Sim_CoRL2020


About

This repository contains code for the CoRL 2020 paper Learning Object Manipulation Skills via Approximate State Estimation from Real Videos. In case of any questions, contact us at vladimir.petrik@cvut.cz or makarand.tapaswi@inria.fr.

Additional resources: arXiv, Project page, YouTube overview video, Paper PDF

Citation:

@inproceedings{petrik2020real2sim,
author = {Vladimir Petrik and Makarand Tapaswi and Ivan Laptev and Josef Sivic},
title = {{Learning Object Manipulation Skills via Approximate State Estimation from Real Videos}},
booktitle = {Conference on Robot Learning (CoRL)},
year = {2020}
}

Code

The code is divided into three main parts:

- Estimating states from video
- Reinforcement learning of control policies
- Benchmarking

The first part has its own installation instructions (a separate conda environment), requires a GPU, and uses a neural renderer. The RL and benchmarking parts share one conda environment and use rlpyt for RL and PyPhysX for simulation. To simplify experimentation, we provide the computed data for each step, so if you are interested in the RL part only, you can download the extracted states as described below.

Estimating states from video

We use 6 videos for each of 9 actions from the Something-Something-v2 dataset. Please download the data from the original link above. The video and action ids are indicated in the code repository.

Structure

Installation

Required data

We also share the video segmentation masks required by the perceptual losses used with the neural renderer (so results can be replicated without the original videos). Extract this tarball into the real2sim/sthsth folder.
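For example, a minimal sketch of the extraction step (the archive name below is a placeholder for the file you actually downloaded):

mkdir -p real2sim/sthsth                               # make sure the target folder exists
tar -xzf segmentation_masks.tar.gz -C real2sim/sthsth  # hypothetical filename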

Optimization

Please look at real2sim/final_scripts.sh for a list of commands to replicate our state estimation results.

Reinforcement learning of control policies

Installation

conda env create -f install_real2sim_rl.yml
conda activate real2sim_rl

export PYTHONPATH="${PYTHONPATH}:`pwd`"  # run in the repo root
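To sanity-check the environment, the two core packages should import cleanly (a quick check, assuming the usual module names rlpyt and pyphysx):

python -c "import rlpyt, pyphysx; print('environment OK')"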

Required data

Either run the state estimation from the previous section, or download and extract the states from here into the folder data/states/. Once this is done, you can visualize the states using:

python simulation/scripts/visualize_states.py data/states/1sA

which will open a 3D viewer and show the states one by one. No physics simulation is performed in this visualization.

Training the policy

The following command trains a policy for action id 86 (pull left to right) on the states 1sA. On a standard 4-core laptop without a GPU it takes about 3 hours to finish. You can track the learning progress in TensorBoard, as sketched after the command.

# add -cuda_id 0 if you have CUDA installed
python policy_learning/manipulation_learning.py -log_dir data/policies -name exp_1sA_act_86 --linear_lr -states_folder data/states/1sA -angle_bound_scale 0.01 --without_object_obs -seed 0
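To follow the training curves, you can point TensorBoard at the log directory used above (a sketch, assuming the event files end up under -log_dir):

tensorboard --logdir data/policies  # then open http://localhost:6006 in a browser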

See the folder policy_learning/scripts/ for the collection of bash scripts we used to train the policies whose performance is reported in the paper. These scripts simply call manipulation_learning.py with different parameters, e.g., different domain randomization or different states; a hedged sketch of such a sweep follows.
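As an illustration of what those scripts do (the seed range here is made up; the remaining arguments mirror the training command above):

# hypothetical sweep: same action and states, several seeds
for seed in 0 1 2; do
    python policy_learning/manipulation_learning.py -log_dir data/policies \
        -name exp_1sA_act_86_seed_${seed} --linear_lr -states_folder data/states/1sA \
        -angle_bound_scale 0.01 --without_object_obs -seed ${seed}
done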

Visualization of trained policy

Use the policy trained in the previous step:

python policy_learning/manipulation_learning.py -log_dir data/policies -name exp_1sA_act_86 -states_folder data/states/1sA --without_object_obs --greedy_eval --render --realtime

You should see the learned behaviour replayed in a real-time 3D viewer.

Benchmarking

For benchmarking, install and activate the conda environment from the RL section:

conda activate real2sim_rl

Benchmarking of policies is divided into two steps: (i) extracting trajectories by applying the policy in simulation, and (ii) evaluating the trajectories with our proposed metrics. To extract trajectories for the trained policy, run:

python benchmarking/extract_benchmark_trajectories.py data/benchmark_specification/benchmark_easy.csv data/benchmark/exp_1sA_act_86 -log_dir data/policies -name exp_1sA_act_86 --without_object_obs

You will see a progress bar indicating the evaluation of 1000 samples from the easy benchmark specification. The trajectories are stored as CSV files in data/benchmark/exp_1sA_act_86.
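To quickly sanity-check the extraction output (a sketch, assuming one CSV file per extracted trajectory):

ls data/benchmark/exp_1sA_act_86/*.csv | wc -l                     # number of stored trajectories
head -n 3 "$(ls data/benchmark/exp_1sA_act_86/*.csv | head -n 1)"  # peek at the first file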

These trajectories are then analysed in the second step by invoking:

python benchmarking/trajectories_analysis.py data/benchmark_specification/benchmark_easy.csv data/benchmark/exp_1sA_act_86

Besides the success rate stored in data/benchmark/performance.txt, the script generates two plots useful for analysis, showing the marginal success rate histograms.
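If you benchmark several trained policies, the two steps can be chained in a small wrapper (a sketch; the experiment names in the list are placeholders):

# hypothetical list of experiments to benchmark on the easy specification
for exp in exp_1sA_act_86 exp_1sA_act_87; do
    python benchmarking/extract_benchmark_trajectories.py \
        data/benchmark_specification/benchmark_easy.csv data/benchmark/${exp} \
        -log_dir data/policies -name ${exp} --without_object_obs
    python benchmarking/trajectories_analysis.py \
        data/benchmark_specification/benchmark_easy.csv data/benchmark/${exp}
done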