Not Only Mapping from Latent Space to Data Space, But Also Mapping from Data Space to Latent Space, Outperforms DCGAN.
In this story, Adversarially Learned Inference (ALI), by Université de Montréal, Stanford, New York University, and CIFAR Fellow, is briefly reviewed.
This is a paper in 2017 ICLR with over 1000 citations.
The idea is the same as BiGAN, but the two were proposed independently and published at the same conference (2017 ICLR). Some papers cite both ALI and BiGAN together when discussing this idea.
The adversarially learned inference (ALI) game.
The loss function is as follows: joint pairs $(x, z)$ are drawn either from the encoder joint distribution $q(x, z) = q(x)q(z|x)$ or the decoder joint distribution $p(x, z) = p(z)p(x|z)$, and a discriminator network learns to discriminate between the two, while the encoder and decoder networks are trained to fool the discriminator.
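Written out, with $G_z$ denoting the encoder and $G_x$ the decoder, the minimax value function takes the usual GAN form over joint pairs:

$$\min_{G} \max_{D} \; V(D, G) = \mathbb{E}_{q(x)}\big[\log D(x, G_z(x))\big] + \mathbb{E}_{p(z)}\big[\log\big(1 - D(G_x(z), z)\big)\big]$$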
The bidirectional GAN. If we treat $G_z(x)$ and $G_x(z)$ in ALI as the encoder $E$ and the decoder (generator) $G$ respectively, it is a bidirectional GAN (BiGAN).
Unlike a standard GAN, where the discriminator sees only $x$ as input, in BiGAN/ALI $D$ sees both $x$ and $z$, i.e., the observation and its latent representation together.
For a true sample, $x$ is given (it is taken from the training set) and the corresponding $z$ is generated by the encoder $E$.
For a fake sample, $z$ is given (it is sampled from $p(z)$) and its corresponding $x$ is generated by the generator $G$, as in the sketch below.
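As a minimal, hypothetical sketch (PyTorch; the network sizes, optimizer settings, and MLP architectures are illustrative and are not the authors' implementation), the joint discriminator and the two kinds of pairs can be wired up like this:

```python
import torch
import torch.nn as nn

latent_dim, data_dim = 64, 784  # e.g. flattened 28x28 images (illustrative)

# Encoder E = G_z: data space -> latent space
E = nn.Sequential(nn.Linear(data_dim, 256), nn.ReLU(), nn.Linear(256, latent_dim))
# Generator/decoder G = G_x: latent space -> data space
G = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, data_dim), nn.Tanh())
# Discriminator D sees the concatenated pair (x, z), not x alone
D = nn.Sequential(nn.Linear(data_dim + latent_dim, 256), nn.ReLU(), nn.Linear(256, 1))

bce = nn.BCEWithLogitsLoss()
opt_D = torch.optim.Adam(D.parameters(), lr=1e-4)
opt_EG = torch.optim.Adam(list(E.parameters()) + list(G.parameters()), lr=1e-4)

def ali_step(x_real):
    batch = x_real.size(0)
    z_prior = torch.randn(batch, latent_dim)

    # "True" pair: x from the data, z inferred by the encoder
    pair_q = torch.cat([x_real, E(x_real)], dim=1)
    # "Fake" pair: z from the prior, x produced by the generator
    pair_p = torch.cat([G(z_prior), z_prior], dim=1)

    # Discriminator learns to tell q(x, z) pairs from p(x, z) pairs
    d_loss = bce(D(pair_q.detach()), torch.ones(batch, 1)) + \
             bce(D(pair_p.detach()), torch.zeros(batch, 1))
    opt_D.zero_grad(); d_loss.backward(); opt_D.step()

    # Encoder and generator are trained to fool the discriminator (labels flipped)
    eg_loss = bce(D(pair_q), torch.zeros(batch, 1)) + \
              bce(D(pair_p), torch.ones(batch, 1))
    opt_EG.zero_grad(); eg_loss.backward(); opt_EG.step()
    return d_loss.item(), eg_loss.item()

# Example usage with a random batch standing in for real images in [-1, 1]
x = torch.rand(32, data_dim) * 2 - 1
print(ali_step(x))
```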
Once training is complete, just as we can use the generator to produce $x$ for a new $z$, we can use the encoder to infer $z$ for any $x$.
Samples and reconstructions on the SVHN dataset.
Samples and reconstructions on the CelebA dataset.
Samples and reconstructions on the CIFAR10 dataset.
Samples and reconstructions on the Tiny ImageNet dataset.
Maybe we can use such an ALI training objective during the replay process to improve the wake-sleep replay mode?
Latent space interpolations on the CelebA validation set.
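Such interpolations are obtained by encoding two validation images, linearly interpolating between their latent codes $z_1 = G_z(x_1)$ and $z_2 = G_z(x_2)$, and decoding the intermediate points:

$$x_\alpha = G_x\big((1-\alpha)\, z_1 + \alpha\, z_2\big), \quad \alpha \in [0, 1]$$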
SVHN test set misclassification rate.
CIFAR10 test set misclassification rate for semi-supervised learning using different numbers of labeled training examples.
It is conjectured that the latent representation learned by ALI is better untangled with respect to the classification task and that it generalizes better.
Conditional generation sequence.
The corresponding loss function is as follows:
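One plausible form of this conditional objective, written here as an assumption (the unconditional value function above, with an attribute vector $y$ additionally provided to the encoder, the decoder, and the discriminator, as in conditional GANs), is:

$$\min_{G} \max_{D} \; V(D, G) = \mathbb{E}_{q(x)p(y)}\big[\log D(x, y, G_z(x, y))\big] + \mathbb{E}_{p(z)p(y)}\big[\log\big(1 - D(G_x(z, y), y, z)\big)\big]$$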
[2017 ICLR] [ALI] Adversarially Learned Inference.
Image Synthesis: [GAN] [CGAN] [LAPGAN] [AAE] [DCGAN] [CoGAN] [SimGAN] [BiGAN] [ALI]
Image-to-image Translation: [Pix2Pix] [UNIT]
Super Resolution: [SRGAN & SRResNet] [EnhanceNet] [ESRGAN]
Blur Detection: [DMENet]
Camera Tampering Detection: [Mantini’s VISAPP’19]
Video Coding: [VC-LAPGAN] [Zhu TMM’20] [Zhong ELECGJ’21]
Sik-Ho Tsang. Review — ALI: Adversarially Learned Inference (GAN).