Comment:
Published on Aug 23, 2019. This paper improves over HER (Hindsight Experience Replay) and GAIL (Generative Adversarial Imitation Learning).
Problem:
Designing rewards for Reinforcement Learning (RL) is challenging because a reward must convey the desired task, be efficient to optimize, and be easy to compute. The last requirement is particularly problematic when applying RL to robotics, where detecting whether the desired configuration has been reached may require considerable supervision and instrumentation.
Innovation:
In this work we investigate different approaches to incorporating demonstrations to drastically speed up convergence to a policy able to reach any goal, also surpassing the performance of an agent trained with other Imitation Learning (IL) algorithms.
Most previous work on IL is centered on trajectory following or on a single task. Furthermore, it is limited by the performance of the demonstrations, or relies on engineered rewards to improve upon them. In this work we first illustrate how IL methods can be extended to the goal-conditioned setting, and study a more powerful relabeling strategy that extracts additional information from the demonstrations. We then propose a novel algorithm, goalGAIL, and show it can outperform the demonstrator without the need for any additional reward.
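The expert-relabeling idea can be sketched as follows: apply HER-style hindsight relabeling to the demonstrations themselves, so each expert state-action pair is also paired with goals achieved later in the same demonstration. This is a minimal illustration, not the paper's implementation; the tuple layout, field order, and `k` relabels per step are assumptions.

```python
import random

def relabel_demonstration(demo, k=4, seed=0):
    """HER-style relabeling applied to an expert demonstration.

    demo: list of (state, action, achieved_goal) tuples.
    Returns goal-conditioned (state, action, goal) tuples where each
    goal is an achieved_goal from the same or a later step of the demo,
    so the expert data also teaches how to reach intermediate goals.
    """
    rng = random.Random(seed)
    relabeled = []
    for t, (s, a, _) in enumerate(demo):
        for _ in range(k):
            future = rng.randrange(t, len(demo))  # pick a future step
            g = demo[future][2]                   # its achieved goal
            relabeled.append((s, a, g))
    return relabeled

# Tiny 3-step demonstration on a grid (illustrative data).
demo = [((0, 0), "right", (1, 0)),
        ((1, 0), "up",    (1, 1)),
        ((1, 1), "right", (2, 1))]
data = relabel_demonstration(demo)
print(len(data))  # 3 steps * k=4 relabels = 12
```

The relabeled tuples can then be fed to goal-conditioned behavioral cloning, which is how the paper extracts more supervision from the same demonstrations.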
Key Techs:
4.1 Goal-conditioned Behavioral Cloning
4.2 Relabeling the expert
4.3 Goal-conditioned GAIL with Hindsight
5.2 Goal-conditioned GAIL with Hindsight: goalGAIL
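The core goalGAIL move is to condition the GAIL discriminator on the goal as well as the state-action pair, and use its output in place of the environment reward inside an off-policy, hindsight-relabeled update. A hedged sketch of that reward substitution, with a toy logistic discriminator and the standard `-log(1 - D)` GAIL reward form (both illustrative assumptions, not the paper's exact architecture):

```python
import math

def discriminator_prob(w, s, a, g):
    """Toy logistic discriminator D(s, a, g): probability that a
    goal-conditioned transition came from the expert.
    w: weight vector; s, a, g: feature vectors (illustrative)."""
    x = list(s) + list(a) + list(g)
    z = sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))

def gail_reward(w, s, a, g, eps=1e-8):
    """GAIL-style surrogate reward that replaces the task reward:
    high when the discriminator thinks the transition is expert-like."""
    d = discriminator_prob(w, s, a, g)
    return -math.log(1.0 - d + eps)

# Untrained (zero) weights give D = 0.5, i.e. reward -log(0.5).
print(gail_reward([0.0, 0.0, 0.0], [1.0], [1.0], [1.0]))
```

Because this reward is computed from learned features rather than instrumentation, the agent needs no hand-engineered task reward, which is the property that lets goalGAIL outperform the demonstrator.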
Link: https://arxiv.org/pdf/1906.05838.pdf Code: https://sites.google.com/view/goalconditioned-il/ Replication: https://openreview.net/forum?id=HJlCUp5M6H&noteId=VPGTimygxA