aelnouby / Text-to-Image-Synthesis

Pytorch implementation of Generative Adversarial Text-to-Image Synthesis paper
GNU General Public License v3.0

Generator diverges? #10

Closed tiagd closed 6 years ago

tiagd commented 6 years ago

Hi,

I trained with gan_cls (not the vanilla but conditioned version) on flowers, with the shared hdf5 file, and I got curves of https://drive.google.com/open?id=1pASanOh9YUdg__I5OPRi_srmu3T2JYx8.

The discriminator loss keeps going down (and nearly converges) to 0.483, but the generator loss keeps going up (does not converge) to 18.58, with D(X) at 0.846 and D(G(X)) at 0.016. The images I get at prediction time are similar to the reported results.

I think the curves I got during training suggest divergence, correct? If training were converging, we should see the generator loss also going down and D(G(X)) going up, correct?

Am I missing anything here? Do you have any suggestions to make the training converge? (I see that you've implemented many tricks from https://github.com/soumith/ganhacks#13-add-noise-to-inputs-decay-over-time. I'm trying trick #2, flipping real_label with fake_label for the generator, but it doesn't seem to help.)
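For reference, trick #2 from ganhacks (flipped labels for the generator) amounts to training G against "real" targets on its fake samples, which is the non-saturating generator loss. A minimal PyTorch sketch (the `generator_loss` helper here is hypothetical, not code from this repo):

```python
import torch
import torch.nn as nn

criterion = nn.BCELoss()

def generator_loss(d_on_fake):
    # ganhacks trick #2: when updating G, label its fake samples as "real",
    # i.e. minimize -log D(G(z)) instead of log(1 - D(G(z)))
    real_labels = torch.ones_like(d_on_fake)
    return criterion(d_on_fake, real_labels)
```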

Look forward to your answer. Thanks!

aelnouby commented 6 years ago

Hi @tiagd,

From my experience on this project, and from other things I have read, there seems to be no real correlation between the training curves and image quality (which is the main goal). Also, please note that the objective function I am using is not the standard one: I add an L1 loss with the ground truth and an L2 loss on the discriminator activations (feature matching). The convergence behavior you would expect may therefore change, but these changes improve image quality.
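A sketch of such a combined generator objective, assuming BCE as the adversarial term; the helper name and the coefficient values are placeholders, not necessarily what the repo uses:

```python
import torch
import torch.nn as nn

bce = nn.BCELoss()
l1 = nn.L1Loss()
l2 = nn.MSELoss()

def generator_objective(d_on_fake, fake_img, real_img,
                        feat_fake, feat_real,
                        l1_coef=50.0, l2_coef=100.0):
    # adversarial term + L1 with the ground-truth image
    # + L2 feature matching on discriminator activations
    # (coefficients are illustrative, not the repo's values)
    real_labels = torch.ones_like(d_on_fake)
    return (bce(d_on_fake, real_labels)
            + l1_coef * l1(fake_img, real_img)
            + l2_coef * l2(feat_fake, feat_real.detach()))
```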

For more interpretable curves I would suggest using WGAN instead; there, image quality should correlate with the decrease in the discriminator loss. However, WGAN takes a long time to train (maybe use the gradient penalty to help with this).
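The gradient penalty mentioned above (WGAN-GP) pushes the critic's gradient norm toward 1 on random interpolations between real and fake samples. A minimal sketch, assuming a generic `critic` callable and the commonly used penalty weight of 10:

```python
import torch

def gradient_penalty(critic, real, fake, gp_weight=10.0):
    # sample a random point on the line between each real/fake pair
    eps = torch.rand(real.size(0), *([1] * (real.dim() - 1)))
    interp = (eps * real + (1 - eps) * fake).requires_grad_(True)
    scores = critic(interp)
    # gradient of the critic's output w.r.t. the interpolated input
    grads = torch.autograd.grad(scores.sum(), interp, create_graph=True)[0]
    grad_norm = grads.view(grads.size(0), -1).norm(2, dim=1)
    # penalize deviation of the per-sample gradient norm from 1
    return gp_weight * ((grad_norm - 1) ** 2).mean()
```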

tiagd commented 6 years ago

Very useful. Thanks aelnouby!

I have the same feeling that there is no close correlation between the training curves and image quality. There may be some correlation for individual examples, but not in general.