igul222 / improved_wgan_training

Code for reproducing experiments in "Improved Training of Wasserstein GANs"
MIT License

Number of critic iterations #9

Closed stefdoerr closed 7 years ago

stefdoerr commented 7 years ago

I am working on a 2D case similar to your toy examples but with a more complex distribution. I noticed big improvements in the contours (i.e. the energy surface learned by the discriminator) when increasing the critic iterations from 5 to 50.

I really think that 5 critic iterations is too low. I see you also use 5 iterations in the other examples, like CIFAR and MNIST, which does not show the full potential of the network. The critic should be given more time to converge.

After only 400 generator iterations I am already getting better results than those reported in the paper for the Swiss roll (plot attached).

igul222 commented 7 years ago

You're right: the theory says that the critic should be trained to optimality at each step; in practice, the closer we get to optimal, the better. The tradeoff is that optimizing the critic for longer takes more time for each iteration. We picked 5 iterations because it was a good tradeoff: stable enough in most settings, but not terribly slow. Increasing this value might help for harder problems though.
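To make the cost side of that tradeoff concrete, here is a minimal sketch of the alternating update schedule. `critic_step` and `generator_step` are hypothetical callables standing in for one optimizer update each; they are not functions from this repo, and the exact ordering of updates in the real scripts may differ.

```python
def train(generator_iters, critic_iters, critic_step, generator_step):
    """Alternate critic and generator updates, WGAN-style.

    critic_step and generator_step are hypothetical placeholders for a
    single optimizer update; they are assumptions made for this sketch.
    """
    for _ in range(generator_iters):
        # Give the critic several updates so it moves closer to optimality
        # before the generator takes a step against it.
        for _ in range(critic_iters):
            critic_step()
        generator_step()


# Count updates instead of training, to show how the work scales.
counts = {"critic": 0, "generator": 0}


def count_critic():
    counts["critic"] += 1


def count_generator():
    counts["generator"] += 1


train(generator_iters=400, critic_iters=50,
      critic_step=count_critic, generator_step=count_generator)
print(counts)  # {'critic': 20000, 'generator': 400}
```

With 5 critic iterations the same 400 generator steps would need 2,000 critic updates, so going to 50 makes each generator iteration roughly ten times more expensive in critic work.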

Re. Swiss roll specifically: the results in the paper show the optimal critic (i.e. trained for 10,000 iterations) against a fixed "generator" (i.e. the generator distribution is held fixed at the data distribution plus Gaussian noise), so the plots aren't really comparable. That said, we were able to train full WGAN-GPs on Swiss roll to full convergence, which it seems like your plot hasn't reached yet. How long did you train for?
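For illustration, a fixed "generator" of that kind can be sketched as below. This is a rough standard-library sketch, not the code used for the paper's figure: the spiral parametrization, the sample count, and the noise scale `sigma` are all assumptions chosen for the example.

```python
import math
import random


def swiss_roll_2d(n, rng):
    """Sample n points from a 2D spiral (an assumed Swiss roll shape)."""
    points = []
    for _ in range(n):
        # t is the angle along the spiral and also the distance from the origin.
        t = 1.5 * math.pi * (1.0 + 2.0 * rng.random())
        points.append((t * math.cos(t), t * math.sin(t)))
    return points


def fixed_generator(points, sigma, rng):
    """Return the data points perturbed by isotropic Gaussian noise.

    No parameters are learned here: the "generator" distribution is simply
    the data distribution plus noise, so only the critic would be trained.
    """
    return [(x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma))
            for x, y in points]


rng = random.Random(0)
real = swiss_roll_2d(256, rng)
fake = fixed_generator(real, sigma=0.5, rng=rng)  # sigma is an assumed value
```

A critic trained to convergence on `real` versus `fake` is a different experiment from alternating critic and generator updates, which is why the two kinds of plots aren't directly comparable.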

stefdoerr commented 7 years ago

Just 400 generator iterations (times 50 discriminator iterations each), as I said above :)

LukasMosser commented 7 years ago

Would it be possible to increase the sample count used to show how well you approximate the distribution? It looks quite patchy after 400 iterations.

stefdoerr commented 7 years ago

@LukasMosser Sure, you just need to modify the number of samples here https://github.com/igul222/improved_wgan_training/blob/master/gan_toy.py#L147

I only ran this for 400 iterations to make the point :) I'm not very interested in letting it run all day for the full number of iterations. Maybe @igul222 will if he updates the paper with the fixed contours, but he did the calculations differently, keeping the generator fixed, so it's not entirely equivalent.

NickShahML commented 7 years ago

@stefdoerr big thanks for suggesting this. I too have found better results after increasing the critic iterations to 50. It takes about an eon to train, but it does help nevertheless.