Open wagamamaz opened 7 years ago
It's hard to say!
Ok, this is for an unconditional boilerplate GAN. What I found for the loss increase in G was that: a) it was accompanied by a decrease in D loss; essentially G starts diverging. b) image quality improved subtly, but it did improve.
I think the discriminator got too strong relative to the generator. Beyond this point, the generator finds it almost impossible to fool the discriminator, hence the increase in its loss. I'm facing a similar problem.
Have you tried label smoothing @vijayvee ?
No I haven't tried it yet @LukasMosser
I am facing a similar problem while training InfoGAN on the SVHN dataset. Any suggestions on how to overcome this?
I am also facing a similar problem with InfoGAN on a different dataset. Any suggestions?
In my experience, when the D loss decreases to a small value (0.1 to 0.2) and the G loss increases to a high value (2 to 3), it means training is finished, as the generator cannot be improved any further.
But if the D loss decreases to a small value in just a few epochs, it means training failed, and you may need to check the network architecture.
I have the same problem. When I train a GAN, I expect that by the end of training (in the limit) G will always fool D. But in fact I face the following problem: at the beginning of the process, G learns correctly; it learns to produce good images with the necessary conditions. But after some point, G starts to diverge. In the end, G can produce only random noise. Why does this happen?
Probably, the problem is that the discriminator overfits. One thing that can lead to this: the discriminator may "notice" that images from the true distribution are matrices of numbers of the form n/255. So adding Gaussian noise to the input images may help avoid the problem. It helped in my case.
Label switching has also helped for me.
Two updates of discriminator with real_label = 1, fake_label=0 and one update with real_label=0 and fake_label=1.
This is followed by one generator update with real_label = 1 and fake_label = 0.
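For concreteness, the label-switching schedule described above might be sketched like this (function names hypothetical; this just encodes the targets, not the full training loop):

```python
# Sketch of the 2+1 label-switching schedule: two "normal" discriminator
# updates (real=1, fake=0), then one with the labels switched
# (real=0, fake=1). The generator always trains its fakes toward real=1.

def d_labels(step):
    """Return (real_label, fake_label) for discriminator update `step`."""
    if step % 3 < 2:          # two normal updates...
        return 1.0, 0.0
    return 0.0, 1.0           # ...then one with switched labels

def g_labels():
    """Generator target: fakes should be scored as real."""
    return 1.0, 0.0

# The first six discriminator steps cycle through the pattern twice.
schedule = [d_labels(s) for s in range(6)]
```

The switched update injects some uncertainty into D, which can keep it from saturating against G.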
Label smoothing helped for me.
Adding gaussian noise helped for me
@Howie-hxu and @EvgenyZamyatin: I saw that adding Gaussian noise in the discriminator helped in your case. I have a few questions:
Keenly awaiting your help! Thanks, Avisek
Same doubt here
Same doubts as yours. @avisekiit
I have used the idea of instance noise described here. My experiment was to add Gaussian noise only to the input tensor of the discriminator. It was zero-mean, and its standard deviation decayed from 0.1 to 0 with each mini-batch iteration. This improved the results considerably on the MNIST dataset.
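A minimal sketch of that decaying instance noise, assuming a linear decay schedule (names and constants illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def instance_noise(x, step, total_steps, start_std=0.1):
    """Add zero-mean Gaussian noise whose stddev decays linearly
    from `start_std` at step 0 to 0 at `total_steps`."""
    std = start_std * max(0.0, 1.0 - step / total_steps)
    if std == 0.0:
        return x
    return x + rng.normal(0.0, std, size=x.shape)

# Applied to every batch fed to D (both real and generated images),
# so early in training the two distributions overlap more.
images = np.zeros((2, 28, 28))               # stand-in for a batch
noisy_early = instance_noise(images, step=0, total_steps=1000)
noisy_late = instance_noise(images, step=1000, total_steps=1000)
```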
Thank you! I'll try it @ahmed-fau
Hello. I am training CycleGAN and my loss looks like the attached picture. The discriminator loss decreases but the generator loss fluctuates. I do not quite understand the reasons. Does anyone have any suggestions? Thanks
Adding noise to the input seems to help. To be specific, I am implementing it in TensorFlow by adding:
input = input + tf.random_normal(shape=tf.shape(input), mean=0.0, stddev=0.1, dtype=tf.float32)
I agree that adding noise to the discriminator's input does help the generator loss decrease. @ahmed-fau suggested very good tips.
Hi, I tried what you guys did: adding Gaussian noise to the input of the discriminator. It does improve the graph, but the test images produced by the generator come out as noise as well. (Previously I had relatively OK images, but my generator loss was going up.)
Thoughts?
Did you also have decay of the noise after a while?
@EvgenyZamyatin adding noise to input helped, great thanks
I am facing a similar problem while using WGAN-GP. The generator initially produces good results but seems to diverge after some time: the discriminator loss suddenly dips, the discriminator becomes very powerful, and the generator starts outputting random noise. What can be done instead of label smoothing, since I am using WGAN?
@aradhyamathur you could try adding a penalty loss term on the discriminator output magnitude, similarly to https://github.com/tkarras/progressive_growing_of_gans
This helps prevent a training dynamic where the models engage in a "magnitudes race" and eventually lose any meaningful learning signal.
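As a hedged sketch of such a penalty term (the coefficient is illustrative; the progressive-growing code uses a small epsilon on the mean squared real-image score, if I recall correctly):

```python
import numpy as np

def drift_penalty(d_real_scores, eps_drift=1e-3):
    """Extra discriminator loss term: eps_drift * E[D(x)^2] on real scores,
    discouraging the raw critic outputs from drifting to large magnitudes."""
    return eps_drift * float(np.mean(np.square(d_real_scores)))

# The penalty grows quadratically with score magnitude, so a
# "magnitudes race" becomes increasingly expensive for D.
small = drift_penalty(np.array([0.5, -0.5]))
large = drift_penalty(np.array([50.0, -50.0]))
```

In practice this term is simply added to the WGAN-GP critic loss alongside the gradient penalty.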
@phamnam95 That looks like typical CycleGAN loss. What is your batch size? If it is one or two, there will be lots of fluctuation in your objective function. Seen it before; looks pretty normal to me.
@LukasMosser My batch size is 1. After adding some more constraints, such as the identity loss and the self-distance loss, and also semi-supervising the CycleGAN with paired images, I can get the generator loss to decrease, but only very slowly; after 200 epochs the trend is still downward. The discriminator loss decreases until a certain number of epochs, then starts to fluctuate. What do you think would help? What batch size do you think is appropriate?
Hi, I have the same problem. Did you manage to solve it? Many thanks!
@phamnam95 I think batch size = 1 is OK. I'm not really worried about the fluctuation; it just means you'll have to pick a checkpoint with an appropriate generator loss, and not one where it seemingly diverged.
Hello everyone. I am working on a project where I am generating hard triplets using a GAN. I am using a food dataset which has 20 different class labels, each with 200 images. My discriminator predicts the fake label in most cases, even for real feature embeddings. How do I deal with this problem?
My model contains a feature extractor, a generator, and a discriminator.
A triplet here means (an anchor image, a positive image, a negative image). I pass each triplet to the feature extractor and get 3 embeddings. These three embeddings are passed to the generator, which produces hard triplets using the triplet loss. The embeddings from the feature extractor are then passed to the discriminator as real embeddings, and the generator passes its hard triplet embeddings to the discriminator as fake ones.
My problem is that the discriminator predicts both feature_extractor_real_embedding and generator_fake_embeddings as fake most of the time. I am working from this paper (http://openaccess.thecvf.com/content_ECCV_2018/papers/Yiru_Zhao_A_Principled_Approach_ECCV_2018_paper.pdf). Can anyone suggest how to deal with this problem? Please @LukasMosser @vijayvee
Hello guys, I trained a GAN to generate more data for my EEG motor imagery classification task. Given how unstable and noise-influenced the data is, how can I know whether the data the GAN generated belongs to a particular category rather than others, when I only use a single class of data as the ground-truth input?
Did you try batch normalization for the discriminator? It helped me out when I encountered the same problem.
Hi, adding noise really works for me.
Hey all,
Just from playing with GANs obsessively for a few weeks now, I've started to notice a few distinct collapse modes:
D overpowers G. G does not change (loss roughly static) while D slowly, steadily goes to 0.
In this case, adding dropout to any/all layers of D helps stabilize things.
Another case: G overpowers D. It just feeds garbage to D, and D does not discriminate. This one has been harder for me to solve! Adding noise to ALL layers of G, with gradual annealing (lowering the noise slightly each iteration), was the solution.
A third failure state, when G and D are roughly balanced but D is more consistent: occasional "spikes" come along, associated with very high gradient norms. These come with dramatic updates to G, and indicate to me that I should increase regularization on D so we get more frequent, less dramatic updates to G.
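The annealed-noise idea in the second mode might be sketched like this, as a toy layer (names, schedule, and layer shape purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)

def annealed_sigma(step, total_steps, start=0.1):
    """Noise scale lowered slightly each iteration, reaching 0 at the end."""
    return start * max(0.0, 1.0 - step / total_steps)

def noisy_layer(x, w, sigma):
    """Toy generator layer: linear + ReLU, with Gaussian noise on the output.
    Applied to every layer of G, this keeps its outputs from collapsing
    onto garbage that D has given up discriminating."""
    h = np.maximum(x @ w, 0.0)
    if sigma > 0.0:
        h = h + rng.normal(0.0, sigma, size=h.shape)
    return h

x = np.ones((1, 4))
w = np.eye(4)
early = noisy_layer(x, w, annealed_sigma(step=0, total_steps=100))
late = noisy_layer(x, w, annealed_sigma(step=100, total_steps=100))
```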
Where did you guys implement the label smoothing? After a number of iterations inside the training loop? Can you be more specific about that, please? I am trying to implement label smoothing in my DCGAN code, but I am not sure where to put it. Thank you.
Hi, how did you implement the second solution (adding noise to G with gradual annealing)? Could you help me, please?
Thank you for your insight. I had never thought about discriminator overfitting in this way.
Thank you. I just use a 3-layer MLP as D, and the hidden layer size is very small; nevertheless, D is still too strong. What can I do?
Hi, I am also facing a situation where the discriminator loss goes to 0 (for both fake and real images) and the generator loss keeps increasing. Any idea how to solve it? I suspect it's because the discriminator is too strong and learning too fast?
If the GAN loss is static and the discriminator loss goes down, this means your generator can produce fake data that holds up even as the discriminator improves. Thus, if the generator loss does not change and the discriminator error falls, your model is improving.
Hi there. Thanks for your post, and thanks for @Alexey322's advice; they are very useful! I am facing the third failure state: even though the G and D losses stay balanced the whole time, some spikes still occur during training. Does that mean the training still failed? Could you please tell me how to apply more regularization to D? Thanks a lot!
Hi, I am training a conditional GAN. At the beginning, both the G and D losses decrease, but around epoch 200 the G loss starts to increase from 1 to 3, and the image quality seems to stop improving.
Any ideas? Thank you in advance.