Hi, I am trying to train the DAGAN on a new dataset, which I have transformed into the standard numpy format [10, 28, 64, 64, 3] (10 people and 28 pictures each); I thought this dataset would be small enough. However, after training the model for over 200 epochs the generator loss never converged, and the generated images are not as expected: in the 16 by 16 grid of generated images, apart from the original images at the top left, all the other images are identical, as shown below. I think the network didn't converge.
[image] https://user-images.githubusercontent.com/35247113/38659647-fc2b9e1e-3e5c-11e8-96ae-68bbd235cb58.png
And this is the loss I got:
[image] https://user-images.githubusercontent.com/35247113/38659687-19b5d9fe-3e5d-11e8-8fde-3da92f5125a6.png
Could you please give me some suggestions/tricks on training the model? I would really appreciate it!
Well, faces are very hard to model. Your main issue here, I think, is that you are using too small a dataset. Try increasing the number of samples, or give me some time to look into it. Also, perhaps try using a smaller discriminator and generator.
Hi Antreas Antoniou, thanks a lot for your quick reply. Since I don't have much GPU capacity, I thought it would be easier to train on a small dataset. I will try increasing the number of samples and see if it works better. Could you give me a basic sense of how long the model needs to train (along with the sample size and number of GPUs you used)? I looked into your paper, but it seems you didn't mention this.
Could anyone please help me? I am still struggling with the issue of generating only one type of face.
When I trained the network it took about 50*4000 iters before the faces converged. How many epochs, and how many iterations per epoch, are you doing?
Hi Antreas Antoniou, thanks a lot for your reply.
Since my dataset is small, I have been training with 200 epochs * 9 iterations (each iteration meaning 5 discriminator updates and 1 generator update). That is indeed a small amount of training, but what really confuses me is that all the generated pictures are the same face. Could you help me with that? Is it due to the small dataset? Should I use a different z-dimension?
Your results look like what the DAGAN generated after 14000 iters in my experiments. It begins varying the faces after 54000 iters and produces photorealistic results after 30*4000. I think your problem is both the small amount of data and the small number of iters. However, with enough iters, even a small amount of data should produce some low-quality results with varying faces, though with poor generalization performance on unseen faces.
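To make the iteration arithmetic above concrete, here is a schematic of the update schedule being discussed. All names are illustrative stand-ins, not the repo's actual code:

```python
# One "iteration" = 5 critic updates followed by 1 generator update,
# so 200 epochs * 9 iterations is only ~1800 generator steps in total,
# versus the ~50*4000 iterations mentioned above.
def train(d_step, g_step, num_epochs=200, iters_per_epoch=9, disc_updates=5):
    for _ in range(num_epochs):
        for _ in range(iters_per_epoch):
            for _ in range(disc_updates):
                d_step()   # one discriminator/critic update
            g_step()       # one generator update

train(d_step=lambda: None, g_step=lambda: None)  # stubs, just to show the schedule
```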
Thanks a lot Antreas! I have increased the iterations to 200 per epoch and it has finally started to generate some variation. I hope it will work in the end. I really appreciate your help.
Hi Antreas Antoniou, I have been training the model and the results are getting better and better. However, the loss curves confuse me a bit (shown below). I cannot understand why both the discriminator and generator losses are diverging while the results keep improving. Could you please help me understand the loss curves, or tell me the loss functions you used for the discriminator and generator? That would save a lot of time compared with fully digging into your code... Thank you. (The order of the loss curves from top to bottom is d_loss_fake, d_loss_real, d_losses, g_losses.)
[image] https://user-images.githubusercontent.com/35247113/38850886-f71d3986-4245-11e8-870a-cb4930e1c8fc.png
I am using the Wasserstein loss with gradient penalty. The important loss that correlates with sample quality is d_loss: the smaller it gets, the better the sample quality becomes, approximately. The g_loss getting bigger is also explained in the WGAN-GP paper; it basically means it is becoming harder and harder to fool the discriminator.
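For reference, a minimal sketch of the standard WGAN-GP losses in TensorFlow 1.x style (a sketch of the usual formulation, not necessarily this repo's exact code; `critic` is assumed to be a function that can be called repeatedly, e.g. built with variable reuse):

```python
import tensorflow as tf

def wgan_gp_losses(critic, x_real, x_fake, gp_weight=10.0):
    """Standard WGAN-GP losses (a sketch, not this repo's exact code)."""
    d_real = critic(x_real)
    d_fake = critic(x_fake)

    # Gradient penalty: score random interpolates between real and fake
    # samples and push the critic's gradient norm towards 1.
    alpha = tf.random_uniform([tf.shape(x_real)[0], 1, 1, 1], 0.0, 1.0)
    x_hat = alpha * x_real + (1.0 - alpha) * x_fake
    grads = tf.gradients(critic(x_hat), [x_hat])[0]
    grad_norm = tf.sqrt(tf.reduce_sum(tf.square(grads), axis=[1, 2, 3]) + 1e-12)
    gp = tf.reduce_mean((grad_norm - 1.0) ** 2)

    # d_loss estimates -(Wasserstein distance) plus the penalty, which is
    # why it is typically negative while the critic is ahead; g_loss rises
    # as the critic becomes harder to fool.
    d_loss = tf.reduce_mean(d_fake) - tf.reduce_mean(d_real) + gp_weight * gp
    g_loss = -tf.reduce_mean(d_fake)
    return d_loss, g_loss
```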
Antreas, thanks a lot for your explanation! Do you know what is happening when there is a steep change (the peak) in the middle of the loss curve? It looks as if the discriminator suddenly found a way to discriminate fake images... quite interesting.
Besides, regarding your statement that "the important loss that correlates with sample quality is d_loss; the smaller it gets, the better the sample quality becomes, approximately": in the loss curves I showed you, d_loss is negative and becoming more negative. What do you mean by d_loss getting smaller: closer to zero, or closer to negative infinity? I have just looked through the WGAN-GP paper, and in the small experiments they ran on LSUN the d_loss is positive, whereas mine is negative. Do you know why this happens? Is it because you changed a setting in your implementation, or is there something wrong with my training? Could I conclude that this loss diverges because of limited data, as stated in their paper?
I really appreciate your help.
If you have more data, just like in my experiments, what you will observe is: d_loss starts positive, goes negative, and keeps becoming more and more negative. At some point it stops and then begins moving back towards 0; that is when the model converges. Initially the discriminator is still learning, so with every step it becomes stronger and predicts a larger-magnitude distance between real and fake.
Oh, maybe I should increase the size of the discriminator? It seems that in my training the discriminator totally loses the game.
Just pass the arguments --gen 2 and --disc 5 to replicate my setup.
Great! Thanks a lot.
@AntreasAntoniou Hi there, how do we change the number of iterations in the code? Do we just edit self.disc_iter and self.gen_iter to a total of 200 in experiment_builder.py? Thanks in advance.
Hi, may I ask how you made your own .npy dataset?
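For anyone wondering about the .npy format described in the original post, here is a minimal sketch of how an array in that [people, images, height, width, channels] layout could be assembled. The folder layout (root/person/image.jpg), resizing, and normalization here are assumptions for illustration, not the repo's actual loader:

```python
import os
import numpy as np
from PIL import Image

def build_dataset(root, size=(64, 64), out_path="my_dataset.npy"):
    # Each subfolder of `root` holds the images of one person.
    # Assumes every person has the same number of images, so the
    # arrays can be stacked into a single rectangular tensor.
    people = sorted(os.listdir(root))
    data = []
    for person in people:
        person_dir = os.path.join(root, person)
        imgs = []
        for fname in sorted(os.listdir(person_dir)):
            img = Image.open(os.path.join(person_dir, fname))
            img = img.convert("RGB").resize(size)
            imgs.append(np.asarray(img, dtype=np.float32) / 255.0)
        data.append(np.stack(imgs))   # [n_images, 64, 64, 3]
    arr = np.stack(data)              # [n_people, n_images, 64, 64, 3]
    np.save(out_path, arr)            # e.g. [10, 28, 64, 64, 3]
    return arr
```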