Open leo-p opened 7 years ago
Linked to recent work such as WGAN or Loss-Sensitive GAN that focus on objective functions with non-vanishing gradients to avoid the situation where the discriminator D
becomes too good and the gradient vanishes.
Thus they first introduce two targets for the discriminator D
and the generator G
:
And then the two new losses:
They use the DCGAN architecture and simply change the loss and remove the batch normalization and other empirical techniques used to stabilize training. They show that the soft-max GAN is still robust to training.
https://arxiv.org/pdf/1704.06191.pdf
Softmax GAN is a novel variant of Generative Adversarial Network (GAN). The key idea of Softmax GAN is to replace the classification loss in the original GAN with a softmax cross-entropy loss in the sample space of one single batch. In the adversarial learning of N real training samples and M generated samples, the target of discriminator training is to distribute all the probability mass to the real samples, each with probability 1/M, and distribute zero probability to generated data. In the generator training phase, the target is to assign equal probability to all data points in the batch, each with probability 1/(M+N). While the original GAN is closely related to Noise Contrastive Estimation (NCE), we show that Softmax GAN is the Importance Sampling version of GAN. We futher demonstrate with experiments that this simple change stabilizes GAN training.