Open florian-boehm opened 6 years ago
That starts to look a lot like Wasserstein GAN (see e.g. https://arxiv.org/abs/1704.00028). That paper also proposes an additional loss term (a gradient penalty) to limit the gradient magnitudes in D.
Thank you for pointing this paper out to me. If I have understood it correctly, this one point is worth mentioning:
In the case of WGAN, the activation function in the last layer of the discriminator should be linear, and because the output can no longer be interpreted as a probability, the discriminator is then called a critic.
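
To make the point about the linear output concrete, here is a minimal sketch in plain Python. It is not taken from the paper's code or from this repository: the linear critic, the function names, and the toy weights and samples are all assumptions chosen for illustration. The penalty weight of 10 is the default suggested in the linked paper. The paper evaluates the penalty at random points between real and fake samples; for a linear critic the input gradient is the same everywhere, which keeps the sketch short.

```python
import math

def critic(x, w, b):
    """Hypothetical linear critic: returns a raw score, no sigmoid on the output."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def mean(values):
    return sum(values) / len(values)

def wgan_critic_loss(real_scores, fake_scores):
    # The critic is trained to score real samples higher than fake ones.
    return mean(fake_scores) - mean(real_scores)

def wgan_generator_loss(fake_scores):
    # The generator is trained to raise the critic's score on fake samples.
    return -mean(fake_scores)

def gradient_penalty_linear(w, lam=10.0):
    # For a linear critic the gradient w.r.t. the input is w at every point,
    # so lam * (||grad|| - 1)^2 does not depend on where it is evaluated.
    grad_norm = math.sqrt(sum(wi * wi for wi in w))
    return lam * (grad_norm - 1.0) ** 2

# Toy values, assumed for illustration only.
w, b = [3.0, 4.0], 0.5
real = [[1.0, 1.0], [2.0, 0.0]]
fake = [[0.0, 0.0], [-1.0, 0.0]]

real_scores = [critic(x, w, b) for x in real]
fake_scores = [critic(x, w, b) for x in fake]

print(wgan_critic_loss(real_scores, fake_scores))   # -8.0
print(wgan_generator_loss(fake_scores))             # 1.0
print(gradient_penalty_linear(w))                   # 160.0
```

Note that the scores are unbounded real numbers, so no log or sigmoid appears in either loss; that is the practical consequence of the linear last layer.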
Hello, I wonder if the following simplifications lead in practice to the same result as the original loss functions:
Thank you very much for your help!
Florian