Autoencoders work well for continuous representations like images, but they struggle with discrete representations like text, where it is hard to learn a code space that varies smoothly with the input
Proposes ARAE (Adversarially Regularized Autoencoder) with the goal of a more robust discrete-space representation
ARAE jointly trains both a rich discrete-space encoder, such as an RNN, and a simpler continuous space generator function, while using generative adversarial network (GAN) training to constrain the distributions to be similar
Using the latent variable model (ARAE-GAN), the model is able to generate varied sentences by moving around in the latent space via interpolation and offset vector arithmetic
Details
Related Works
A major difficulty of discrete autoencoders is mapping a discrete structure to a continuous code vector while also smoothly capturing the complex local relationships of the input space
Standard autoencoders tend to learn a degenerate identity mapping when the latent code space is free of any structure
a popular approach is to regularize through an explicit prior on the code space and use a variational approximation to the posterior, leading to a family of models called variational autoencoders (VAE)
Unfortunately VAEs for discrete text sequences can be challenging to train — if not carefully trained with techniques like word dropout and KL annealing, the decoder simply becomes a language model and ignores the latent code
Possible reasons are the strictness of the prior (usually a spherical Gaussian) and/or the parameterization of the posterior
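A minimal sketch of those two tricks as I understand them (the function names and the 0.3 / 10000 defaults are my own illustrative choices, not values from the cited work):

```python
import torch

def kl_weight(step, total_anneal_steps=10000):
    """Linear KL annealing (sketch): ramp the weight on the KL term from 0 to 1."""
    return min(1.0, step / total_anneal_steps)

def word_dropout(tokens, unk_id, p=0.3):
    """Word dropout (sketch): replace decoder-input tokens with <unk> with
    probability p, so the decoder cannot ignore the latent code."""
    mask = torch.rand(tokens.shape) < p
    return torch.where(mask, torch.full_like(tokens, unk_id), tokens)
```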
Background
perfect and concise explanation of VAE and GAN
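For reference, the standard forms of the two objectives (textbook notation, which may differ slightly from the paper's):

```latex
% VAE: maximize the ELBO w.r.t. encoder q_\phi and decoder p_\theta
\mathcal{L}_{\text{VAE}}(\theta,\phi;x)
  = \mathbb{E}_{q_\phi(z\mid x)}\!\left[\log p_\theta(x\mid z)\right]
  - \mathrm{KL}\!\left(q_\phi(z\mid x)\,\|\,p(z)\right)

% GAN: minimax game between generator G and discriminator D
\min_{G}\max_{D}\;
  \mathbb{E}_{x\sim p_{\text{data}}}\!\left[\log D(x)\right]
  + \mathbb{E}_{z\sim p(z)}\!\left[\log\bigl(1 - D(G(z))\bigr)\right]
```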
Adversarially Regularized Autoencoder
Model
a discrete autoencoder should be able to reconstruct x from c, but also smoothly assign similar codes c and c' to similar x and x'
To target the smoothness issue, we learn a parallel continuous-space generator with a restricted functional form to act as a smoother reference encoding
The joint objective regularizes the autoencoder to constrain the discrete encoder to agree in distribution with its continuous counterpart
W is the Wasserstein-1 distance between the distribution of encoder codes and the distribution of generator codes
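My reconstruction of the joint objective (the equation itself did not copy over, so the notation here is approximate): reconstruction loss plus a Wasserstein penalty between the encoder's code distribution and the generator's code distribution, with the W term estimated by a WGAN critic.

```latex
% Joint objective (approximate notation): reconstruction + Wasserstein regularizer
\min_{\phi,\psi}\; \mathcal{L}_{\text{rec}}(\phi,\psi)
  + \lambda\, W\!\left(\mathbb{P}_Q,\ \mathbb{P}_z\right)

% WGAN-style estimate of the W term with a 1-Lipschitz critic f_w
W\!\left(\mathbb{P}_Q, \mathbb{P}_z\right)
  \approx \max_{\|f_w\|_L \le 1}\;
    \mathbb{E}_{c\sim\mathbb{P}_Q}\!\left[f_w(c)\right]
  - \mathbb{E}_{\tilde{c}\sim\mathbb{P}_z}\!\left[f_w(\tilde{c})\right]
```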
Training
use block coordinate descent to alternate between optimizing different parts of the model (a rough sketch follows the list):
(1) the encoder and decoder to minimize reconstruction loss
(2) the WGAN critic function to approximate the W term
(3) the encoder and generator to adversarially fool the critic to minimize W
ARAE Training
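A rough PyTorch sketch of one training step, assuming modules `enc`, `dec`, `gen`, `critic` and their optimizers are defined elsewhere; `dec.reconstruction_loss` and the hyperparameters are illustrative, not the authors' code:

```python
import torch

def arae_step(x, enc, dec, gen, critic,
              opt_ae, opt_critic, opt_adv, noise_dim=100, clip=0.01):
    # (1) encoder + decoder: minimize reconstruction loss
    opt_ae.zero_grad()
    c = enc(x)
    rec_loss = dec.reconstruction_loss(c, x)   # hypothetical helper, e.g. token-level cross-entropy
    rec_loss.backward()
    opt_ae.step()

    # (2) critic: approximate the W term (weight clipping as in WGAN)
    opt_critic.zero_grad()
    with torch.no_grad():
        c_real = enc(x)                         # codes from the discrete encoder
        c_fake = gen(torch.randn(x.size(0), noise_dim))  # codes from the smooth generator
    critic_loss = -(critic(c_real).mean() - critic(c_fake).mean())
    critic_loss.backward()
    opt_critic.step()
    for p in critic.parameters():
        p.data.clamp_(-clip, clip)

    # (3) encoder + generator (only their params are in opt_adv): fool the critic
    opt_adv.zero_grad()
    z = torch.randn(x.size(0), noise_dim)
    adv_loss = critic(enc(x)).mean() - critic(gen(z)).mean()
    adv_loss.backward()
    opt_adv.step()
    return rec_loss.item(), critic_loss.item(), adv_loss.item()
```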
Extension: Code Space Transfer
when the autoencoder is trained with an attribute value attached to each example (in a multilingual setting, the language) and the GAN/critic is trained so that the code space is invariant to this attribute, the encoder becomes a semantic encoder that maps the same semantics in different languages to nearby codes, and the decoder can decode the code into any language via the attribute flag (a small sketch below)
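A sketch of my reading of this extension: an attribute classifier tries to predict the attribute from the code and the encoder is trained to fool it, while the decoder conditions on the attribute. All names (`enc`, `dec.generate`, `attr_clf`) are illustrative, not the paper's API.

```python
import torch.nn.functional as F

def attribute_invariance_step(x, y, enc, attr_clf):
    """One adversarial step: classifier learns to predict attribute y from the
    code; encoder learns to erase attribute information from the code."""
    c = enc(x)
    clf_loss = F.cross_entropy(attr_clf(c.detach()), y)  # train classifier on detached codes
    enc_loss = -F.cross_entropy(attr_clf(c), y)          # encoder fools the classifier
    return clf_loss, enc_loss

def transfer_decode(x_src, enc, dec, target_attr):
    """Decode the same attribute-invariant code under a different attribute flag."""
    c = enc(x_src)
    return dec.generate(c, attr=target_attr)             # hypothetical decoder API
```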
Is ARAE smooth?
compare with generator code
Is ARAE robust?
ARAE shows lower reconstruction error than the baseline autoencoder as the number of word swaps in the input increases
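My reading of the swap perturbation, as a tiny sketch (the paper's exact swap scheme may differ; this assumes k random adjacent transpositions):

```python
import random

def swap_k_words(tokens, k):
    """Apply k random adjacent transpositions to a token list (illustrative)."""
    tokens = list(tokens)
    for _ in range(k):
        i = random.randrange(len(tokens) - 1)
        tokens[i], tokens[i + 1] = tokens[i + 1], tokens[i]
    return tokens
```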
ARAE-GAN
linear interpolation of sentences
adding attributes to a sentence via offset vector arithmetic (sketched below)
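A tiny sketch of both manipulations in the generator's noise space, assuming a trained `gen` and `dec` as above and a hypothetical `dec.generate` API:

```python
import torch

def interpolate_and_decode(z_a, z_b, gen, dec, steps=5):
    """Linear interpolation (sketch): decode a sentence at each point on the
    line from z_a to z_b in noise space."""
    sentences = []
    for t in torch.linspace(0.0, 1.0, steps):
        z = (1 - t) * z_a + t * z_b
        sentences.append(dec.generate(gen(z)))  # hypothetical decode API
    return sentences

def add_attribute(z, offset_vector, gen, dec):
    """Offset vector arithmetic (sketch): add an attribute direction, e.g. the
    mean code difference between sentences with and without the attribute."""
    return dec.generate(gen(z + offset_vector))
```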
Personal Thoughts
Good explanation, comparison of VAE and GAN
Lots of experiments, this is the style of Yoon Kim :)
Using a smooth generator to regularize the encoder is creative, but I did not fully understand how that works
Sample interpolation is interesting, but definitely less striking than in images
a change of a few words/tokens is not as interesting as the visual changes in an image
Style/semantic transfer has an accuracy of 85%, which seems too low for a good transfer function
Link: https://arxiv.org/pdf/1706.04223.pdf
Authors: Zhao et al., 2017