NorbertZheng / read-papers

My paper reading notes.

Sik-Ho Tsang | Review -- BiGAN: Adversarial Feature Learning (GAN). #53

Closed NorbertZheng closed 1 year ago

NorbertZheng commented 1 year ago

Sik-Ho Tsang. Review — BiGAN: Adversarial Feature Learning (GAN).

NorbertZheng commented 1 year ago

Overview

Bidirectional Generative Adversarial Networks (BiGANs): Learning the Inverse Mapping from Image Space to Latent Space.

In this story, Adversarial Feature Learning (BiGAN), by the University of California, Berkeley, and the University of Texas at Austin, is briefly reviewed.

This is a 2017 ICLR paper with over 1100 citations.

The idea is the same as that of ALI; the two were proposed independently and published at the same conference (2017 ICLR). Some papers cite BiGAN and ALI together when discussing this idea.

NorbertZheng commented 1 year ago

BiGAN: Overall Structure

image BiGAN: Overall Structure.

$$ \min_{G,E}\max_{D}V(D,E,G) $$

$$ V(D,E,G):=\mathbb{E}_{x \sim p(x)}\Big[\underbrace{\mathbb{E}_{z \sim p_{E}(\cdot|x)}[\log D(x,z)]}_{\log D(x,E(x))}\Big]+\mathbb{E}_{z \sim p(z)}\Big[\underbrace{\mathbb{E}_{x \sim p_{G}(\cdot|z)}[\log(1-D(x,z))]}_{\log(1-D(G(z),z))}\Big]. $$

A model trained to predict features $z$ given data $x$ should learn useful semantic representations. The BiGAN objective forces the encoder $E$ to do exactly this.

In order to fool the discriminator at a particular $z$, the encoder must invert the generator at that $z$, such that $E(G(z))=z$, which is exactly what the $L_{g}$ term in TEM does!
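
A minimal PyTorch sketch of this objective (the architectures, sizes, and names here are placeholders, not the paper's convnets; the discriminator sees joint pairs, $(x,E(x))$ as "real" and $(G(z),z)$ as "fake"):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Placeholder MLP components for a flattened-image BiGAN.
z_dim, x_dim = 50, 784
E = nn.Sequential(nn.Linear(x_dim, 256), nn.ReLU(), nn.Linear(256, z_dim))      # encoder E(x)
G = nn.Sequential(nn.Linear(z_dim, 256), nn.ReLU(), nn.Linear(256, x_dim))      # generator G(z)
D = nn.Sequential(nn.Linear(x_dim + z_dim, 256), nn.ReLU(), nn.Linear(256, 1))  # joint discriminator D(x, z)

def bigan_losses(x_real):
    """One evaluation of V(D, E, G): D scores (x, E(x)) vs. (G(z), z)."""
    z_fake = torch.randn(x_real.size(0), z_dim)     # z ~ p(z)
    z_real = E(x_real)                              # E(x), deterministic encoder
    x_fake = G(z_fake)                              # G(z)
    d_real = D(torch.cat([x_real, z_real], dim=1))  # logit for D(x, E(x))
    d_fake = D(torch.cat([x_fake, z_fake], dim=1))  # logit for D(G(z), z)
    # D maximizes V: push D(x, E(x)) -> 1 and D(G(z), z) -> 0.
    # (In a real loop, detach z_real / x_fake for the D update.)
    loss_d = F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real)) + \
             F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake))
    # E and G minimize V (non-saturating form: flip the targets).
    loss_eg = F.binary_cross_entropy_with_logits(d_real, torch.zeros_like(d_real)) + \
              F.binary_cross_entropy_with_logits(d_fake, torch.ones_like(d_fake))
    return loss_d, loss_eg
```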

NorbertZheng commented 1 year ago

Experimental Results

Permutation-Invariant MNIST

image One Nearest Neighbors (1NN) classification accuracy (%) on the permutation-invariant MNIST test set in the feature space.

All methods, including BiGAN, perform at roughly the same level. This result is not overly surprising given the relative simplicity of MNIST digits.

image Qualitative results for permutation-invariant MNIST BiGAN training, including generator samples $G(z)$, real data $x$, and corresponding reconstructions $G(E(x))$.

Digits generated by the generator $G$ nearly perfectly match the data distribution (qualitatively, e.g. at the pixel level), as shown above.
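
A minimal sketch of the 1NN evaluation protocol above, assuming a trained encoder `E` (as in the earlier sketch) and scikit-learn; the data variables are placeholders:

```python
import torch
from sklearn.neighbors import KNeighborsClassifier

# Placeholders: x_train/x_test are flattened MNIST image tensors,
# y_train/y_test their numpy label arrays, E the trained BiGAN encoder.
with torch.no_grad():
    feat_train = E(x_train).numpy()  # embed data into the learned feature space
    feat_test = E(x_test).numpy()

knn = KNeighborsClassifier(n_neighbors=1)  # 1NN classifier in feature space
knn.fit(feat_train, y_train)
print("1NN accuracy:", knn.score(feat_test, y_test))
```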

NorbertZheng commented 1 year ago

ImageNet

image Qualitative results for ImageNet BiGAN training, including generator samples $G(z)$, real data $x$, and corresponding reconstructions $G(E(x))$.

As shown above, the reconstructions, while certainly imperfect, demonstrate empirically that the BiGAN encoder $E$ and generator $G$ learn approximate inverse mappings.
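
One simple way to quantify how close $E$ and $G$ are to exact inverses is the reconstruction error; a minimal sketch, assuming the trained `E` and `G` from above and a placeholder batch `x`:

```python
import torch

# Placeholders: E and G are the trained encoder/generator,
# x a batch of images in the flattened shape E expects.
with torch.no_grad():
    x_rec = G(E(x))                            # reconstruction G(E(x))
    mse = torch.mean((x - x_rec) ** 2).item()  # distance from an exact inverse
print(f"mean squared reconstruction error: {mse:.4f}")
```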

image Classification accuracy (%) for the ImageNet LSVRC validation set.

BiGAN is competitive with these contemporary visual feature learning methods.
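
A common way such feature-learning comparisons are run is a linear probe on frozen features; a minimal sketch of that protocol (not necessarily the paper's exact setup), assuming a trained encoder `E` and a placeholder data `loader`:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Linear probe: freeze the trained encoder E and fit only a linear
# classifier on its features. z_dim/num_classes are placeholders.
z_dim, num_classes = 50, 1000
probe = nn.Linear(z_dim, num_classes)
opt = torch.optim.SGD(probe.parameters(), lr=0.01, momentum=0.9)

for x, y in loader:      # placeholder ImageNet-style DataLoader
    with torch.no_grad():
        feats = E(x)     # frozen features; no gradient reaches E
    loss = F.cross_entropy(probe(feats), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
```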

NorbertZheng commented 1 year ago

PASCAL VOC

image Classification and Fast R-CNN detection results for the PASCAL VOC 2007 test set and FCN segmentation results on the PASCAL VOC 2012 validation set.

NorbertZheng commented 1 year ago

Reference

[2017 ICLR] [BiGAN] Adversarial Feature Learning.

NorbertZheng commented 1 year ago

Generative Adversarial Network (GAN)

Image Synthesis: [GAN] [CGAN] [LAPGAN] [AAE] [DCGAN] [CoGAN] [SimGAN] [BiGAN]
Image-to-image Translation: [Pix2Pix] [UNIT]
Super Resolution: [SRGAN & SRResNet] [EnhanceNet] [ESRGAN]
Blur Detection: [DMENet]
Camera Tampering Detection: [Mantini’s VISAPP’19]
Video Coding: [VC-LAPGAN] [Zhu TMM’20] [Zhong ELECGJ’21]