Abstract

Generative Adversarial Networks(GAN)은 데이터 생성에서 뛰어난 모습을 보이고 있다. 많은 영역에서 쓰이고 있지만 여전히 안정적인 학습에는 어려움이 따른다. 문제점으로는 Nash-equilibrium, internal covariate shift, mode collapse, vanishing gradient, evaluation metrics의 부족 등이 있다. 본 survey에서는 GAN학습을 안정하게 하기 위한 training solution을 몇 가지 소개한다.

original GAN model과 여기에서 파생된 version들을 소개한다.
여러 domain에서 적용된 GAN을 소개한다.
GAN training시 문제점들을 해결하기 위한 논문들을 소개한다.

Keyworkds - GAN, Variants, Applications, Training

1. Introduction

Generative Adversarial Networks(GAN)

Generative Adversarial Networks(GAN)은 최근 unsupervised learning 방식으로 학습하여 뛰어난 결과를 내는 generatvie model로 언급된다. GAN은 대부분 semi-supervised 과 unsupervised learning 으로 학습한다. GAN takes a supervised learning approach to do unsupervised learning by generating fake or synthetic looking data.

GAN은 generator network(G)과 discriminator network(D)를 동시에 학습한다. D는 input data가 real / fake를 판단하는 binary classifier를 학습한다. 반대로 G는 D가 실제 데이터처럼 착각하도록 데이터를 생성하도록 학습한다.

GAN의 이용

GAN은 많은 영역에서 이용된다.hand-written font generation, anime characters generation, image blending, image in-painting, face agiing, text synthesis, human pose synthesis, stenographic applications, image manipulation applications, visual saliency prediction, object detection, 3D image synthesis, medical applications, facial makeup transfer, facial landmark dectection, image super-resolution, texture synthesis, sketch synthesis, image-to-image translation, face frontal view generation, language and speech syntehsis, music generation, video applications

GAN가 가진 문제들

GAN을 통해서 매우 실제적인 sample을 얻을 수 있지만, 여전히 GAN을 안정적으로 학습하는 것은 어려운 일이다. 그 이유는 G와 D는 alternating gradient descent(Alt-GD)이나 simultaneous gradient descent(Sim-GD)를 통해서 최적화 되기 때문이다.
GAN의 구조는 shortcomings, Nash-equilibrium, internal covariate shift, mode collapse, vanishing gradient, and lack of proper evaluation metrics 문제를 가지고 있다.

이 문제들을 해결하기 위해서 feature matching, unrolled GAN, mini-batch discrimination, historical averaging, two time-scale update rule, hybridmodel, self-attention GAN, relativistic GAN, label-smoothing, sampling GAN, proper optimizer, normalization techniques, add noise to inputs, using labels, alternative loss fuctions, gradient penalty, and cycle-consistency loss 방법들이 소개된다.

1.1 Structure of this survey

Section 2 : GAN에 대해서 간단하게 소개, GAN으로 부터 확장된 네트워크를 살펴본다. [Table 1]
Section 3 : GAN가 AI의 여러 영역에서 사용되는 경우를 살펴본다. [Table 3]
Section 4 : GAN을 안정적으로 학습하는 방법과 관련된 연구를 살펴본다.
Section 5 : 이 survey의 결론을 언급한다.

Table 1

#	Subject	Pub	Year
1	Conditional Generative Adversarial Networks (CGAN)	arXiv	2014
2	Deep Convolution Generative Adversarial Networks (DCGAN)	CoRR	2015
3	Laplacian Generative Adversarial Networks (LapGAN)	NIPS	2015
4	Information Maximizing Generative Adversarial Networks (InfoGAN)	NIPS	2016
5	Energy-Based Generative Adversarial Networks (EBGAN)	ICLR	2016
6	Wasserstein Generative Adversarial Networks (WGAN)	ICML	2017
7	Boundary Equilibrium Generative Adversarial Networks (BEGAN)	CSLG	2017
8	Progressive-Growing Generative Adversarial Networks (PGGAN)	ICLR	2018
9	Big Generative Adversarial Networks (BigGAN)	ICLR	2019
10	Style-Based Generator Architecture for Generative Adversarial Networks	IEEE	2019

Table 3

Dom.	Subject	model names
Image	Handwritten font generation	DenseNet-CycleGAN / LS-CGAN / GlyphGAN
Image	Anime characters generation	DRA-GAN / PS-GAN
Image	Image blending	GP-GAN / GCC-GAN
Image	Image in-painting	Ex-GAN / PG-GAN
Image	Face aging	Age-CGAN / CAAE / IP-CGAN / Wavelet-GANs
Image	Text synthesis	AttnGAN / StackGAN / ACGAN / TACGAN / SISGAN
Image	Human pose synthesis	PG / Deformable-GAN
Image	Stenographic applications	S-GAN / SS-GAN / Stegano-GAN
Image	Image manipulation	IGAN / TAGAN / IAN / AttGAN / DGAN
Image	Visual saliency prediction	Sal-GAN / SalCapsule-CGAN / DSAL-GAN
Image	Object detection	SeGAN / PGAN / MTGAN / GANDO
Image	3D image synthesis	3DGAN / PrGAN
Image	Medical applications	SeGAN / MedGAN
Image	Facial makeup transfer	BGAN / PairedCycleGAN / DMTGAN
Image	Face landmark detection	SAN / Expose-GAN
Image	Image super-resolution	SR-GAN / ESR-GAN / SR-DGAN / T-GAN
Image	Texture synthesis	MGAN / SGAN /PSGAN
Image	Sketch synthesis	TGAN / Sketchy-GAN / CA-GAN
Image	Image-to-image translation	Pix2pix / PAN/ CycleGAN/ Disco-GAN / Dual-GAN / StarGAN / UNIT /MUNIT /DRIT / DRIT++
Image	Face frontal view generation	DRGAN / TPGAN / FFGAN /FTGAN
Audio	Speech and audio synthesis	Rank-GAN / VAW-GAN
Audio	Music generation	C-RNNGAN / Seq-GAN / ORGAN
video	Video applications	VGAN / MoCoGAN / DRNET / DVD-GAN

doublejy715 / Paper_review

A Survey on Generative Adversarial Networks : Variants, Applications, and Training #33