soumith / dcgan.torch

A torch implementation of http://arxiv.org/abs/1511.06434

Possible reasons for similarity among generated images #49


vasavig commented 7 years ago

Hi,

I have been playing around with the DCGAN architecture, and I have a question about the similarity among the generated images.

I trained the network on 140-dimensional vectors sampled from N(0, 1). The results were good after a few epochs, and they looked varied too. The generator's output looks like the following: [image: generated samples]

I then modified the above network to take a table of inputs (one 100-dimensional vector and one 40-dimensional vector, both sampled from N(0, 1)) through a ParallelTable, and joined them with a JoinTable to make a 140-dimensional vector. The following are some results: [image: generated samples from the table-input variant]
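
For concreteness, here is a minimal sketch of the two input pipelines (illustrative only, not the actual training script):

```lua
require 'torch'
require 'nn'

local batchSize = 64

-- Variant 1: a single 140-dimensional noise vector per sample, z ~ N(0, 1).
local z = torch.randn(batchSize, 140)

-- Variant 2: a 100-dim and a 40-dim noise vector passed as a table through
-- a ParallelTable of identity branches, then concatenated by a JoinTable
-- into the same 140-dimensional vector.
local join = nn.Sequential()
join:add(nn.ParallelTable()
           :add(nn.Identity())    -- 100-dim branch
           :add(nn.Identity()))   -- 40-dim branch
join:add(nn.JoinTable(2))         -- join along the feature dimension

local z1 = torch.randn(batchSize, 100)
local z2 = torch.randn(batchSize, 40)
local zJoined = join:forward({z1, z2})   -- batchSize x 140

-- For the DCGAN generator, both would then be viewed as batchSize x 140 x 1 x 1.
```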

The above two networks should be essentially equivalent: parameter learning happens only in the layers following the join table in the second network, and that part of the architecture is identical in both. Yet the results from the first network are more varied, while the second output contains many near-duplicate images.
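
As a sanity check of that equivalence (a quick sketch, not from the training script): concatenating independent N(0, 1) samples of sizes 100 and 40 should be distributionally identical to a single 140-dimensional N(0, 1) sample, so both generators see identically distributed inputs.

```lua
require 'torch'

local n = 100000
local joined = torch.cat(torch.randn(n, 100), torch.randn(n, 40), 2)
local direct = torch.randn(n, 140)
-- Both should print mean ~0 and std ~1: the joined input is just another
-- 140-dimensional standard normal, element for element.
print(joined:mean(), joined:std())
print(direct:mean(), direct:std())
```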

I have observed this on other datasets too. To my understanding, after training, the DCGAN generator learns a mapping from Z-vector space to images. Is it possible for the generator to learn only a "certain set" of images (not necessarily in the training set, so there is no overfitting) for the whole Z-vector distribution, and to output only those images for various Z-vector inputs? It would be great if anyone could shed some light on why this might happen.
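
A crude check I could run (a sketch only; `netG` and `nz` stand for a trained generator and its noise dimension, not code from this repo) is to compare how far the outputs move relative to the inputs: if the generator has collapsed onto a small set of images, very different z vectors will map to nearly identical outputs.

```lua
require 'torch'

-- Average pairwise distances in z-space vs. image space. A healthy generator
-- gives output distances that grow with input distances; a collapsed one
-- gives near-zero output distances for almost any input pair.
local function pairwiseDiversity(netG, nz, nPairs)
  local dz, dx = 0, 0
  for i = 1, nPairs do
    local za = torch.randn(1, nz, 1, 1)
    local zb = torch.randn(1, nz, 1, 1)
    local xa = netG:forward(za):clone()  -- clone: forward() reuses its output buffer
    local xb = netG:forward(zb):clone()
    dz = dz + (za - zb):norm()
    dx = dx + (xa - xb):norm()
  end
  return dz / nPairs, dx / nPairs
end
```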

Thanks!

soumith commented 7 years ago

Is your second set of generations from a pre-trained network?

vasavig commented 7 years ago

No, I trained both networks from scratch on the CelebA dataset; the only difference is how I created the input to the generator, as explained in my comment.

vasavig commented 7 years ago

Seems like this is a mode collapse problem, where the generator models the data distribution as a single Gaussian rather than a mixture of Gaussians. More pointers on mode collapse can be found in Ian Goodfellow's NIPS 2016 tutorial: https://arxiv.org/abs/1701.00160.
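
One direct way to see the collapse (a sketch in the spirit of this repo's generation script; the checkpoint path below is illustrative) is to interpolate between two noise vectors: a healthy generator morphs smoothly between two distinct faces, while a collapsed one snaps to the same few images along the whole path.

```lua
require 'torch'

-- netG: a trained generator checkpoint; the path is a placeholder.
local netG = torch.load('checkpoints/netG.t7')

local nz = 140
local za = torch.randn(1, nz, 1, 1)
local zb = torch.randn(1, nz, 1, 1)

local frames = {}
for i = 0, 10 do
  local t = i / 10
  local z = za * (1 - t) + zb * t        -- straight line in z-space
  frames[#frames + 1] = netG:forward(z):clone()
end
-- Tile or save `frames` to inspect how the outputs change along the path.
```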

danieleghisi commented 7 years ago

@vasavig Did you find a way to overcome this problem?

danieleghisi commented 7 years ago

In this paper (Unrolled Generative Adversarial Networks, https://arxiv.org/abs/1611.02163), the authors claim to have solved the mode collapse issue by "unrolling" the discriminator's (Adam) optimization steps. I wouldn't be able to implement any of this in the current model, but it might be cool to consider it...?
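
As far as I understand it, the gist of the unrolled update, heavily simplified, would be something like the following (`dTrainStep`/`gTrainStep` are hypothetical stand-ins for the existing D/G update closures in the training script, and the real method also backpropagates through the unrolled steps, which plain Torch7 modules can't do):

```lua
-- First-order sketch of the unrolled update from Metz et al., ignoring the
-- second-order term (the paper also differentiates through the unrolled D
-- steps). Note: Adam's moment buffers are not rolled back here either; that
-- is a further simplification.
local paramsD = netD:getParameters()   -- flattened D parameters (call once)
local k = 5                            -- number of unrolling steps

local backup = paramsD:clone()
for i = 1, k do dTrainStep() end       -- "lookahead": k extra D updates
gTrainStep()                           -- update G against the unrolled D
paramsD:copy(backup)                   -- roll D back to its pre-unroll state
```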