They use DeepPy, which seems to be the author's own deep learning framework.
Author/Institution
Anders Boesen Lindbo Larsen, Søren Kaae Sønderby, Hugo Larochelle, Ole Winther
Technical University of Denmark, University of Copenhagen, Twitter
What is this
Combine VAEs and GANs.
Propose to use learned feature representations in the GAN discriminator as basis for the VAE reconstruction objective.
Thereby, replace element-wise errors with feature-wise errors.
Moreover, show that the network is able to disentangle factors of variation in the input data distribution and discover visual attributes in the high-level representation of the latent space.
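Comparison with previous research. What are the novelties/good points?
Why not just use a VAE?
An element-wise metric (e.g. pixel-wise MSE) is not appropriate for images: a small translation can produce a large pixel-wise error even though the image looks almost the same to a human. The discriminator offers a learned similarity measure instead.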
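A toy illustration of the point about pixel-wise metrics, as a minimal sketch. The 1-D "image", the helper names `mse` and `shift_right`, and the chosen values are all made up for this note and do not come from the paper. Shifting a stripe pattern by a single pixel gives a larger MSE than halving its brightness, although the shifted pattern is visually the same texture.

```python
def mse(a, b):
    """Mean squared error between two equal-length pixel sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def shift_right(pixels, k=1):
    """Circularly shift a 1-D row of pixels by k positions."""
    k %= len(pixels)
    return pixels[-k:] + pixels[:-k] if k else list(pixels)


# Hypothetical 1-D "image": alternating black/white stripes.
stripes = [0.0, 1.0] * 8
shifted = shift_right(stripes, 1)     # same pattern, moved by one pixel
dimmed = [0.5 * p for p in stripes]   # same position, half brightness

mse_shifted = mse(stripes, shifted)   # every pixel flips, so the error is maximal
mse_dimmed = mse(stripes, dimmed)     # only the bright pixels change, by 0.5

print("MSE after 1-pixel shift:", mse_shifted)
print("MSE after dimming:      ", mse_dimmed)
```

A feature-wise metric is meant to be less sensitive to this kind of shift, because it compares learned representations rather than raw pixels.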
Why is a GAN alone not enough?
A GAN has no encoder, so it cannot map an input image to a latent code. A GAN by itself is therefore not enough for some purposes.
Key points
Collapse the VAE decoder and the GAN generator into one
Share parameters
Train jointly
Replace the element-wise reconstruction metric with a feature-wise metric expressed in the discriminator
The loss consists of three terms:
$L_{prior}$: KL divergence term from the VAE
$L^{Dis_l}_{llike}$: reconstruction error expressed in the features of the $l$-th layer of the GAN discriminator
$L_{GAN}$: standard GAN loss function = $\log(Dis(x)) + \log(1 - Dis(Gen(z)))$
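The three terms listed above (prior, feature-wise reconstruction, GAN) can be sketched on scalar toy inputs as follows. This is a rough sketch under assumptions, not the authors' DeepPy code: it assumes a diagonal Gaussian encoder with a standard normal prior (the usual VAE closed form for the KL term) and a unit-variance Gaussian on the discriminator features, so the reconstruction term reduces to a squared error up to a constant. The function names and all input values are hypothetical.

```python
import math


def kl_prior(mu, log_var):
    """KL(N(mu, diag(exp(log_var))) || N(0, I)) in closed form."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0
                     for m, lv in zip(mu, log_var))


def feature_recon(feat_x, feat_x_tilde):
    """Squared error between discriminator features of x and its reconstruction.

    Assumes a unit-variance Gaussian on the features, so this is the
    negative log-likelihood up to an additive constant.
    """
    return 0.5 * sum((a - b) ** 2 for a, b in zip(feat_x, feat_x_tilde))


def gan_loss(d_real, d_fake):
    """log(Dis(x)) + log(1 - Dis(Gen(z))) for discriminator outputs in (0, 1)."""
    return math.log(d_real) + math.log(1.0 - d_fake)


# Hypothetical values standing in for network outputs.
mu, log_var = [0.0, 0.0], [0.0, 0.0]           # encoder output equal to the prior
feat_x, feat_x_tilde = [1.0, 2.0], [1.0, 0.0]  # Dis_l(x) and Dis_l(reconstruction)

l_prior = kl_prior(mu, log_var)
l_llike = feature_recon(feat_x, feat_x_tilde)
l_gan = gan_loss(d_real=0.5, d_fake=0.5)       # an undecided discriminator

print("L_prior:", l_prior)
print("L_llike:", l_llike)
print("L_GAN:  ", l_gan)
```

How the terms are weighted and which sub-network is updated with which term is not covered in this note, so the sketch keeps them separate.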
Algorithm
How did the authors prove the effectiveness of the proposal?
Conducted experiments on the CelebA dataset and showed that generative models trained with learned similarity measures produce better image samples than models trained with element-wise error measures.
Any discussions?
How does it perform in terms of computational cost?
How to determine when to stop GAN training? (maybe need to check the code)
What should I read next?
Note
How is the performance? Could it be faster or computationally less expensive compared to MSE?
Summary
Link
Autoencoding beyond pixels using a learned similarity metric
Official implementation