Anshdeep-Singh opened this issue 3 years ago
Regarding the issues you have, I think you can try the following:

1) We have tried the rmc_vdcnn architecture in different ways, but rmc_vanilla simply outperforms it in most cases. Have you tried rmc_vanilla in your case? Does it have the same issues as the ones you observed?

2) If rmc_vanilla also has this issue, I think you may want to make sure a) the model is saved at the exact training iteration (i.e., when the generated samples are good), and b) the model is saved properly, i.e., all the variables have been saved instead of some being reinitialized during model loading (see the sketch below).

Hope these suggestions could help.
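A minimal TF1-style sketch of point 2), assuming the tf.train.Saver-based checkpoint flow a TF 1.x codebase like RelGAN would use; nadv_steps, eval_every, the checkpoint path, and the "samples look good" check are hypothetical placeholders:

```python
import tensorflow as tf  # TF 1.x, as used by RelGAN

nadv_steps, eval_every = 3000, 100  # hypothetical values

# ... build the RelGAN graph here (generator, discriminator, temperature, ...) ...

# Training side: save at the exact iteration where samples look good.
# A plain tf.train.Saver() covers all global variables by default, so
# nothing (e.g., embeddings or a temperature variable) is silently left out.
saver = tf.train.Saver(max_to_keep=5)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for it in range(nadv_steps):
        # ... run one adversarial G/D update ...
        if it % eval_every == 0:  # and generated samples / nll_gen look good
            saver.save(sess, 'ckpt/relgan', global_step=it)

# Inference side: rebuild the *identical* graph, then restore BEFORE running
# any initializer that could overwrite the trained weights.
with tf.Session() as sess:
    saver.restore(sess, tf.train.latest_checkpoint('ckpt'))
    # Sanity check: an empty result means no variable was reinitialized or missed.
    assert len(sess.run(tf.report_uninitialized_variables())) == 0
```

If that assert fires, some variables are being created after the restore (or an extra global initializer is being run), which would match the symptom of good training-time samples but bad samples after loading.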
So I tried running it with rmc_vanilla. Although the generator loss and discriminator loss were better than before and decreased at the same pace, the results of rmc_vanilla were worse than rmc_vdcnn on both the adversarial and the testing samples. Here's an example:

1) Adversarial training (50 epochs): epoch 50 -> g_loss 0.8, d_loss 0.8, nll_gen = 0.7
2) Adversarial training (500 epochs): epoch 500 -> g_loss 0.3, d_loss 0.4, nll_gen = 1.4
3) Testing sample: nll_gen = 11
The above outputs are what I got using rmc_vanilla.
I even tried different values of beta_max and yet the output was similar.
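For reference, beta_max is the ceiling of the Gumbel-softmax temperature that RelGAN anneals over adversarial training; here is a minimal sketch of that kind of schedule, assuming an exponential variant (the exact schedule functions in the repo may differ):

```python
def temperature(beta_max, step, nadv_steps, adapt='exp'):
    """Anneal the Gumbel-softmax temperature from 1 toward beta_max over
    adversarial training. A sketch of the kind of schedule RelGAN uses;
    the exact variants in the repo may differ."""
    frac = step / float(nadv_steps)
    if adapt == 'lin':
        return 1.0 + frac * (beta_max - 1.0)
    if adapt == 'exp':
        return beta_max ** frac  # 1 at step 0, beta_max at the final step
    raise ValueError('unknown schedule: %s' % adapt)

# e.g. with beta_max = 1000, the temperature is only ~31.6 halfway through
print(temperature(1000, 50, 100))  # -> 31.62...
```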
Hope you are doing great. We are training your code on the StoryCloze dataset, which contains five-sentence stories, and we intend to generate stories like this example:
Today is Sarah's 5th birthday! Her parents threw her a party at her favorite Mexican restaurant. A balloon artist made balloon hats for Sara and her friends. Sara got lots of presents. The birthday party was a great success.
We are currently facing a few problems:

1) During adversarial training, gen_loss sits at around 0.25 for the first 100 epochs, but after epoch 100 it suddenly jumps to 0.999 and keeps fluctuating there, while dis_loss keeps decreasing.

2) Samples generated during adversarial training are good, but when we try to generate new samples using the saved model, the sample quality drops and nll_gen goes up to 10, even though it was around 0.6 during adversarial training. We tried different parameter values, including the default ones, but nll_gen still shoots up to 10-12 when generating samples. We ran the code with rmc_vdcnn.
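One quick way to tell whether this is a save/load problem rather than a training problem is to compute nll_gen right before saving and right after restoring into a fresh session, and compare. A hedged sketch, where nll_gen_op, saver, sess, and step stand in for the real objects in the code:

```python
import tensorflow as tf  # TF 1.x

# Measure nll_gen immediately before saving (inside the training session).
nll_before = sess.run(nll_gen_op)
saver.save(sess, 'ckpt/relgan', global_step=step)

tf.reset_default_graph()
# ... rebuild the identical graph here, recreating nll_gen_op ...
with tf.Session() as fresh_sess:
    tf.train.Saver().restore(fresh_sess, tf.train.latest_checkpoint('ckpt'))
    nll_after = fresh_sess.run(nll_gen_op)

print(nll_before, nll_after)  # should match up to sampling noise
```

If the two numbers disagree by a large margin (like 0.6 vs 10+), the checkpoint is not capturing the full model state, e.g., some variables are being reinitialized at load time.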
We tried different parameter values:

batch_size = 32 or 64
gen_emb_dim = 32
dis_emb_dim = 64
mem_slots = 1
head_size = 512
num_heads = 2
gf_dim = 64 or 32
df_dim = 64 or 32
gsteps = 1 or 3
dsteps = 5
npre_epochs = 200 or 150
nadv_steps = 100 or 150
dlr = 1e-6
gpre_lr = 1e-2 -> 1e-4
glr = 1e-4
beta_max (temperature) = 1000
Do you have any solutions to the above problems?