Summary

Link
DARLA: Improving Zero-Shot Transfer in Reinforcement Learning (https://arxiv.org/abs/1707.08475)

Author/Institution
Irina Higgins, Arka Pal, et al. / DeepMind (ICML 2017)

What is this
Prior approaches learn good internal representations from both source and target domain data
The reliance on target domain information can be problematic, as such data may be expensive or difficult to obtain
Learning exclusively on the source domain with an end-to-end deep RL approach
Leads to poor domain adaptation performance
DARLA tackles both issues by learning an underlying low-dimensional factorised representation of the world
Demonstrates how disentangled representations can improve the robustness of RL algorithms in domain adaptation scenarios
The theoretical utility of disentangled representations for reinforcement learning had been described before, but had not been empirically validated
RL algorithms tested
DQN
A3C
Model-Free Episodic Control
Comparison with previous researches. What are the novelties/good points?
Key points
Consists of a three-stage pipeline (see the sketch after this list)
learning to see
learning to act
transfer
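A minimal PyTorch sketch of the three-stage decoupling, not the paper's model: the stand-in encoder, latent size, and 4-action head are placeholder assumptions, and the stage-2 RL training loop is elided.

```python
import torch
import torch.nn as nn

# Stage 1 ("learning to see"): train a beta-VAE (with DAE feature-space
# reconstruction, see below) on observations; keep only its encoder.
encoder = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 32))  # stand-in
for p in encoder.parameters():
    p.requires_grad = False  # vision is frozen after stage 1

# Stage 2 ("learning to act"): train any RL head (DQN / A3C / episodic
# control) on top of the frozen disentangled latents.
policy_head = nn.Linear(32, 4)  # 4 discrete actions, assumed

def act(obs: torch.Tensor) -> torch.Tensor:
    with torch.no_grad():
        latents = encoder(obs)     # s = F(o): disentangled state
    return policy_head(latents)    # logits; only this part is RL-trained

# Stage 3 ("transfer"): deploy the unchanged agent zero-shot on the
# target domain -- no parameter is updated on target-domain data.
target_obs = torch.rand(1, 1, 64, 64)   # stand-in target-domain frame
print(act(target_obs).shape)            # torch.Size([1, 4])
```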
Replaces the pixel-based reconstruction loss in the VAE objective with a reconstruction loss in the feature space of J, a pretrained denoising autoencoder (my reconstruction of the equations follows)
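My reconstruction of the two objectives the note refers to (Eq. 1 and Eq. 2 in the paper), written as quantities to maximise; sign conventions are mine. Here x̂ ∼ p_θ(x|z) is the model's reconstruction and J(·) maps an image to the hidden representation of the pretrained DAE.

Eq. 1 (β-VAE):

$$\mathcal{L}(\theta,\phi;x,z,\beta) = \mathbb{E}_{q_\phi(z|x)}\left[\log p_\theta(x|z)\right] - \beta\, D_{KL}\left(q_\phi(z|x)\,\|\,p(z)\right)$$

Eq. 2 (β-VAE_DAE), with the pixel-based term replaced by DAE feature-space reconstruction:

$$\mathcal{L}(\theta,\phi;x,z,\beta) = -\,\mathbb{E}_{q_\phi(z|x)}\left[\|J(\hat{x}) - J(x)\|_2^2\right] - \beta\, D_{KL}\left(q_\phi(z|x)\,\|\,p(z)\right)$$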
"the disentangled model used for DARLA was trained with a β hyperparameter value of 1"
"Note that by replacing the pixel based reconstruction loss in Eq. 1 with high-level feature recon- struction loss in Eq. 2 we are no longer optimising the vari- ational lower bound, and β-VAEDAE with β = 1 loses its equivalence to the Variational Autoencoder (VAE) frame- work as proposed by (Kingma & Welling, 2014; Rezende et al., 2014)."
How the author proved effectiveness of the proposal?
Experiments
DeepMind Lab
Jaco robotic arm (including a sim2real set-up: a MuJoCo simulation is the source domain and the real robotic arm is the target domain)
Any discussions?
What should I read next?