Autoencoders work well for continuous representations like images, but they struggle with discrete representations like text, where it is hard to learn a code space that varies smoothly with the input
Proposes ARAE (Adversarially Regularized Autoencoder) with the goal of a more robust discrete-space representation
ARAE jointly trains both a rich discrete-space encoder, such as an RNN, and a simpler continuous space generator function, while using generative adversarial network (GAN) training to constrain the distributions to be similar
Using the latent variable model (ARAE-GAN), the model is able to generate varied sentences by moving around in the latent space via interpolation and offset vector arithmetic
Details
Related Works
A major difficulty of discrete autoencoders is mapping a discrete structure to a continuous code vector while also smoothly capturing the complex local relationships of the input space
Standard autoencoders tend to learn a degenerate identity mapping when the latent code space is free of any structure
a popular approach is to regularize through an explicit prior on the code space and use a variational approximation to the posterior, leading to a family of models called variational autoencoders (VAE)
Unfortunately VAEs for discrete text sequences can be challenging to train — if not carefully trained with techniques like word dropout and KL annealing, the decoder simply becomes a language model and ignores the latent code
Possible reasons are the strictness of the prior (usually a spherical Gaussian) and/or the parameterization of the posterior
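A minimal sketch of those two tricks as I understand them (the function names and the 0.3 / 10000 defaults are my own illustrative choices, not values from the cited work):

```python
import torch

def kl_weight(step, total_anneal_steps=10000):
    """Linear KL annealing (sketch): ramp the weight on the KL term from 0 to 1."""
    return min(1.0, step / total_anneal_steps)

def word_dropout(tokens, unk_id, p=0.3):
    """Word dropout (sketch): replace decoder-input tokens with <unk> with
    probability p, so the decoder cannot ignore the latent code."""
    mask = torch.rand(tokens.shape) < p
    return torch.where(mask, torch.full_like(tokens, unk_id), tokens)
```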
Background
perfect and concise explanation of VAE and GAN
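For reference, the standard forms of the two objectives (textbook notation, which may differ slightly from the paper's):

```latex
% VAE: maximize the ELBO w.r.t. encoder q_\phi and decoder p_\theta
\mathcal{L}_{\text{VAE}}(\theta,\phi;x)
  = \mathbb{E}_{q_\phi(z\mid x)}\!\left[\log p_\theta(x\mid z)\right]
  - \mathrm{KL}\!\left(q_\phi(z\mid x)\,\|\,p(z)\right)

% GAN: minimax game between generator G and discriminator D
\min_{G}\max_{D}\;
  \mathbb{E}_{x\sim p_{\text{data}}}\!\left[\log D(x)\right]
  + \mathbb{E}_{z\sim p(z)}\!\left[\log\bigl(1 - D(G(z))\bigr)\right]
```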
Adversarially Regularized Autoencoder
Model
a discrete autoencoder should be able to reconstruct x from c, but also smoothly assign similar codes c and c' to similar x and x'
To target the smoothness issue, we learn a parallel continuous-space generator with a restricted functional form to act as a smoother reference encoding
The joint objective regularizes the autoencoder to constrain the discrete encoder to agree in distribution with its continuous counterpart
W is the Wasserstein-1 distance between the distribution of encoder codes and the distribution of generator codes
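My reconstruction of the joint objective (the equation itself did not copy over, so the notation here is approximate): reconstruction loss plus a Wasserstein penalty between the encoder's code distribution and the generator's code distribution, with the W term estimated by a WGAN critic.

```latex
% Joint objective (approximate notation): reconstruction + Wasserstein regularizer
\min_{\phi,\psi}\; \mathcal{L}_{\text{rec}}(\phi,\psi)
  + \lambda\, W\!\left(\mathbb{P}_Q,\ \mathbb{P}_z\right)

% WGAN-style estimate of the W term with a 1-Lipschitz critic f_w
W\!\left(\mathbb{P}_Q, \mathbb{P}_z\right)
  \approx \max_{\|f_w\|_L \le 1}\;
    \mathbb{E}_{c\sim\mathbb{P}_Q}\!\left[f_w(c)\right]
  - \mathbb{E}_{\tilde{c}\sim\mathbb{P}_z}\!\left[f_w(\tilde{c})\right]
```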
Training
use block coordinate descent to alternate between optimizing different parts of the model (a rough sketch follows the list):
(1) the encoder and decoder to minimize reconstruction loss
(2) the WGAN critic function to approximate the W term
(3) the encoder and generator to adversarially fool the critic to minimize W
ARAE Training
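A rough PyTorch sketch of one training step, assuming modules `enc`, `dec`, `gen`, `critic` and their optimizers are defined elsewhere; `dec.reconstruction_loss` and the hyperparameters are illustrative, not the authors' code:

```python
import torch

def arae_step(x, enc, dec, gen, critic,
              opt_ae, opt_critic, opt_adv, noise_dim=100, clip=0.01):
    # (1) encoder + decoder: minimize reconstruction loss
    opt_ae.zero_grad()
    c = enc(x)
    rec_loss = dec.reconstruction_loss(c, x)   # hypothetical helper, e.g. token-level cross-entropy
    rec_loss.backward()
    opt_ae.step()

    # (2) critic: approximate the W term (weight clipping as in WGAN)
    opt_critic.zero_grad()
    with torch.no_grad():
        c_real = enc(x)                         # codes from the discrete encoder
        c_fake = gen(torch.randn(x.size(0), noise_dim))  # codes from the smooth generator
    critic_loss = -(critic(c_real).mean() - critic(c_fake).mean())
    critic_loss.backward()
    opt_critic.step()
    for p in critic.parameters():
        p.data.clamp_(-clip, clip)

    # (3) encoder + generator (only their params are in opt_adv): fool the critic
    opt_adv.zero_grad()
    z = torch.randn(x.size(0), noise_dim)
    adv_loss = critic(enc(x)).mean() - critic(gen(z)).mean()
    adv_loss.backward()
    opt_adv.step()
    return rec_loss.item(), critic_loss.item(), adv_loss.item()
```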
Extension: Code Space Transfer
when the autoencoder is trained with an attribute value attached to each example (in a multilingual setting, the language) and the GAN/critic is trained so that the code space is invariant to this attribute, the encoder becomes a semantic encoder that maps the same semantics in different languages to nearby codes, and the decoder can decode the code into any language via the attribute flag (a small sketch below)
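A sketch of my reading of this extension: an attribute classifier tries to predict the attribute from the code and the encoder is trained to fool it, while the decoder conditions on the attribute. All names (`enc`, `dec.generate`, `attr_clf`) are illustrative, not the paper's API.

```python
import torch.nn.functional as F

def attribute_invariance_step(x, y, enc, attr_clf):
    """One adversarial step: classifier learns to predict attribute y from the
    code; encoder learns to erase attribute information from the code."""
    c = enc(x)
    clf_loss = F.cross_entropy(attr_clf(c.detach()), y)  # train classifier on detached codes
    enc_loss = -F.cross_entropy(attr_clf(c), y)          # encoder fools the classifier
    return clf_loss, enc_loss

def transfer_decode(x_src, enc, dec, target_attr):
    """Decode the same attribute-invariant code under a different attribute flag."""
    c = enc(x_src)
    return dec.generate(c, attr=target_attr)             # hypothetical decoder API
```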
Is ARAE smooth?
compare with generator code
Is ARAE robust?
ARAE shows lower reconstruction error than the baseline autoencoder as the number of word swaps in the input increases
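My reading of the swap perturbation, as a tiny sketch (the paper's exact swap scheme may differ; this assumes k random adjacent transpositions):

```python
import random

def swap_k_words(tokens, k):
    """Apply k random adjacent transpositions to a token list (illustrative)."""
    tokens = list(tokens)
    for _ in range(k):
        i = random.randrange(len(tokens) - 1)
        tokens[i], tokens[i + 1] = tokens[i + 1], tokens[i]
    return tokens
```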
ARAE-GAN
linear interpolation of sentences
adding attributes to a sentence via offset vector arithmetic (sketched below)
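A tiny sketch of both manipulations in the generator's noise space, assuming a trained `gen` and `dec` as above and a hypothetical `dec.generate` API:

```python
import torch

def interpolate_and_decode(z_a, z_b, gen, dec, steps=5):
    """Linear interpolation (sketch): decode a sentence at each point on the
    line from z_a to z_b in noise space."""
    sentences = []
    for t in torch.linspace(0.0, 1.0, steps):
        z = (1 - t) * z_a + t * z_b
        sentences.append(dec.generate(gen(z)))  # hypothetical decode API
    return sentences

def add_attribute(z, offset_vector, gen, dec):
    """Offset vector arithmetic (sketch): add an attribute direction, e.g. the
    mean code difference between sentences with and without the attribute."""
    return dec.generate(gen(z + offset_vector))
```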
Personal Thoughts
Good explanation, comparison of VAE and GAN
Lots of experiments, this is the style of Yoon Kim :)
Using a smooth generator to regularize the encoder is creative, but I did not fully understand how that works
Sample interpolation is interesting, but definitely less striking than in images
a change of a few words/tokens is not as interesting as the visual changes in an image
Style/semantic transfer has an accuracy of 85%, which seems too low for a good transfer function
Link: https://arxiv.org/pdf/1706.04223.pdf
Authors: Zhao et al., 2017