That's exactly what I do. In this case, `y_true` = 1 or -1, so `K.mean(y_true * y_pred)` is the same as the first part of the equation.
I give more details on the way I optimize this GAN here: https://github.com/tdeboissiere/DeepLearningImplementations/tree/master/WassersteinGAN/src/model
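To make that concrete, here is a minimal sketch of a loss along these lines (assuming the Keras backend API; the actual implementation lives in the repository linked above):

```python
from keras import backend as K

def wasserstein_loss(y_true, y_pred):
    # y_pred is the critic output f_w(.), and y_true is +1 or -1,
    # so this computes the batch mean of y * f_w(.).
    return K.mean(y_true * y_pred)
```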
The approximate Wasserstein loss you define is `K.mean(y_true * y_pred)`.
But in the paper (Algorithm 1) they optimize:

\frac{1}{m}\sum_{i=1}^{m} f_w(x^{(i)}) \;-\; \frac{1}{m}\sum_{i=1}^{m} f_w(g_\theta(z^{(i)}))

where x is the batch of real images, z is the noise input to the generator g, and f_w is the critic.
Your objective seems to function just fine, since the network clearly learns well. I'm confused about where the multiplication comes from, though. Could you explain that part of your code?
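For what it's worth, the connection can be written out. Assuming the convention that real images are labelled y = -1 and generated ones y = +1 (the opposite convention just flips the overall sign), summing the loss over one real batch and one generated batch gives

\frac{1}{m}\sum_{i=1}^{m}(-1)\,f_w(x^{(i)}) + \frac{1}{m}\sum_{i=1}^{m}(+1)\,f_w(g_\theta(z^{(i)})) = -\left[\frac{1}{m}\sum_{i=1}^{m} f_w(x^{(i)}) - \frac{1}{m}\sum_{i=1}^{m} f_w(g_\theta(z^{(i)}))\right]

so minimizing the Keras loss maximizes the critic objective from Algorithm 1.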