Open YifanZuo opened 7 years ago
Now I can confirm that this implementation has two errors. First, the optimizer should be RMSProp, not Adam. Second, according to the original paper, the final output must come from a fully connected layer that produces a single scalar.
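To make the optimizer point concrete, here is a minimal NumPy sketch of one RMSProp update followed by weight clipping, the combination used for the critic in WGAN. It is an illustration and not this repository's code. The learning rate `5e-5` and clip value `0.01` are the defaults listed in Algorithm 1 of the WGAN paper; the `decay` and `eps` values and the function names `rmsprop_step` and `clip_weights` are my own assumptions.

```python
import numpy as np

def rmsprop_step(w, grad, sq_avg, lr=5e-5, decay=0.9, eps=1e-8):
    """One RMSProp update. decay and eps are assumed values, not from the paper."""
    # Running average of squared gradients (no momentum term, unlike Adam).
    sq_avg = decay * sq_avg + (1.0 - decay) * grad ** 2
    # Scale the step by the root of that running average.
    w = w - lr * grad / (np.sqrt(sq_avg) + eps)
    return w, sq_avg

def clip_weights(w, c=0.01):
    """Clamp every critic weight into [-c, c] after the update."""
    return np.clip(w, -c, c)

# Toy critic weights and a toy gradient, just to exercise the two functions.
w = np.array([0.5, -0.5, 0.001])
grad = np.ones_like(w)
sq_avg = np.zeros_like(w)

w, sq_avg = rmsprop_step(w, grad, sq_avg)
w = clip_weights(w)
```

In a real framework you would use the built-in RMSProp optimizer instead of writing the update by hand; the sketch only shows what changes relative to Adam and where the clipping goes.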
I have a question about how the discriminator is constructed. In this implementation, the final convolutional layer reduces the number of channels to 1. In other implementations, e.g., "Improved WGAN", the final layer is a fully connected layer with output dimension 1. Which one is better? I cannot find any description of the discriminator architecture in the original paper (Wasserstein GAN).
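For comparison, here is a small NumPy sketch of the two output heads. It is not taken from either implementation; the feature shape `(8, 4, 4)` and the helper names `conv2d_single_channel` and `fc_scalar` are assumptions for illustration. It shows that a 1-channel convolution whose kernel covers the whole feature map computes the same scalar as a fully connected layer with the same weights, while a smaller kernel yields a map of scores that still has to be reduced to one value.

```python
import numpy as np

def conv2d_single_channel(x, kernel):
    """Valid cross-correlation of x (C, H, W) with one kernel (C, kh, kw).

    No padding, stride 1, one output channel. Returns (H-kh+1, W-kw+1).
    """
    _, h, w = x.shape
    _, kh, kw = kernel.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[:, i:i + kh, j:j + kw] * kernel)
    return out

def fc_scalar(x, weights):
    """Fully connected layer (no bias) from flattened features to one scalar."""
    return float(x.reshape(-1) @ weights.reshape(-1))

rng = np.random.default_rng(0)
features = rng.normal(size=(8, 4, 4))  # assumed last feature map: C=8, H=W=4
weights = rng.normal(size=(8, 4, 4))   # shared by both heads

# Kernel spans the full 4x4 map: output is 1x1, i.e. a single scalar.
conv_score = conv2d_single_channel(features, weights)

# Same weights used as a fully connected layer on the flattened features.
fc_score = fc_scalar(features, weights)

# A smaller 3x3 kernel leaves a 2x2 map of scores, not one scalar.
patch_scores = conv2d_single_channel(features, weights[:, :3, :3])
```

So the two heads differ only when the last convolution does not cover the whole feature map; in that case the result is a grid of per-patch scores, and how it is reduced (for example by averaging) becomes part of the design.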