Closed kellymarchisio closed 6 years ago
@kellymarchisio First, thanks for your contributions to this code. Your understanding is right. However, the vocabulary file differs slightly from the vocab.bpe.32000 released with the WMT en-de corpora in its artificial tokens, such as `<PAD>`, `<S>`, `</S>` and `<UNK>`. These tokens are used when preparing the training data. You only need to add these four tokens manually at the beginning of vocab.bpe.32000. I checked the log file of our experiments again and found that the discriminator reached an accuracy of 0.7 after 2 epochs of training.
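For reference, prepending the four artificial tokens to the vocabulary file can be done with a short script like the sketch below. The exact token spellings and their order are assumptions based on common conventions; check the repo's data utilities for the strings it actually expects.

```python
def add_special_tokens(src_path, dst_path,
                       tokens=("<PAD>", "<S>", "</S>", "<UNK>")):
    """Prepend the special tokens, one per line, to a vocab file.

    Token spellings and order are assumptions -- verify them against
    the repo's data-preparation code before use.
    """
    with open(src_path, encoding="utf-8") as f:
        original = f.read()
    with open(dst_path, "w", encoding="utf-8") as f:
        f.write("\n".join(tokens) + "\n" + original)

# Usage (paths are illustrative):
# add_special_tokens("vocab.bpe.32000", "vocab.bpe.32000.full")
```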
In my project, when running the discriminator pretraining, my loss also falls to 0-2, but the accuracy always oscillates around 0.5. What kinds of problems might cause that? Thanks.
@jeicy07 In my project, the silly cause of this behaviour was that my pickled dictionary was built incorrectly. I fixed it to make sure it was a string:int mapping of word:id. When it was broken, my entire src/trg/neg matrices were written as 1s (for UNK). It sounds like the behaviour you observe is symptomatic of indistinguishable matrices. Try logging the final matrices you feed into the discriminator to see if you observe anything unusual, then backtrack from there.
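A quick sanity check along these lines: verify that the pickled vocabulary really is a word-to-id dict, and measure what fraction of the encoded matrices is the UNK id. The path and the UNK id of 1 are assumptions taken from this thread, not the repo's guaranteed values.

```python
import pickle
from collections import Counter

def check_vocab(path):
    """Load a pickled vocabulary and assert it maps str -> int.
    The pickle path is hypothetical -- point it at your own file."""
    with open(path, "rb") as f:
        vocab = pickle.load(f)
    assert all(isinstance(k, str) and isinstance(v, int)
               for k, v in vocab.items()), "expected a word:id (str:int) dict"
    return vocab

def unk_ratio(rows, unk_id=1):
    """Return the fraction of tokens equal to unk_id across all rows.
    A ratio near 1.0 means the inputs are indistinguishable and the
    discriminator can only guess (accuracy ~0.5)."""
    counts = Counter(tok for row in rows for tok in row)
    total = sum(counts.values())
    ratio = counts[unk_id] / total if total else 0.0
    if ratio > 0.5:
        print("warning: %.0f%% of tokens are UNK" % (100 * ratio))
    return ratio
```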
Thanks, I converted my pickled dictionary into a proper dict, and finally it works!
@ZhenYangIACAS Thanks very much for your response. After fixing some errors, I also achieve accuracy 0.70 after 2 epochs. After how many epochs do you reach 0.82/0.95? I am still training (~epoch 4) but performance is still ~0.70.
I notice though that the accuracy bounces around quite significantly as seen here:
Is this expected, or a bug?
I also notice that loss alternates between very high values and lower values at the beginning of training:
Is this also expected, and what might cause this behavior? I would expect loss to monotonically decrease.
Thanks very much for releasing this code base - I've enjoyed working with it.
@kellymarchisio I am sorry for the late response. Your loss curve is strange: it swings between its upper and lower bounds. In our experiments, the loss decreased smoothly. Have you shuffled your training data?
@ZhenYangIACAS Yes, the training data is being shuffled. Now, on epoch 5, the model has begun to overfit. The peak was ~0.71-0.72 in earlier epochs. The config I'm using is below. Does anything look amiss here?
The training accuracy is now 0.75-0.90 per batch, but dev accuracy stays at 0.61-0.71, as it was in epoch 2, except that in epoch 2 the performance was more consistent.
@kellymarchisio There is no obvious error for your configuration.
@ZhenYangIACAS thanks for taking a look. To verify, should dis_positive_data, dis_negative_data, etc. look like regular sentences like:
This is a sentence .
Or do I have to pad the text file sent to the config so they look like:
<S> This is a sentence . </S> <PAD> <PAD> <PAD>...
I believe I've tried both, but your verification would be helpful.
@kellymarchisio You do not need to add the padding to the files manually. The code will do it automatically.
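For intuition, the automatic padding is roughly equivalent to the sketch below: wrap each sentence in start/end tokens and right-pad the batch to a common length. This is a simplified illustration under assumed token spellings, not the repo's actual implementation.

```python
def pad_batch(sentences, max_len=None, bos="<S>", eos="</S>", pad="<PAD>"):
    """Wrap each tokenized sentence in <S> ... </S> and right-pad every
    sentence in the batch to the same length with <PAD>.

    Token spellings are assumptions; the repo's preprocessing may differ.
    """
    wrapped = [[bos] + list(s) + [eos] for s in sentences]
    if max_len is None:
        max_len = max(len(s) for s in wrapped)
    return [s + [pad] * (max_len - len(s)) for s in wrapped]
```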
@ZhenYangIACAS Thank you for the clarification. According to your paper, I won't be able to reproduce the GAN training unless I get 82% accuracy. A few quick related questions:
I am sure that all of the parameters which have much effect on translation performance are described in detail in our paper. We did not use pre-trained word embeddings; you can find the initialization method in our code. I remember that when we used the Transformer as the generator, the accuracy was hard to push above 90%. As for your problem, it seems a bug exists, but I am not sure.
@kellymarchisio How did you compute dev accuracy for the discriminator? I am using the Transformer as the generator. The code for computing dev accuracy is commented out in the cnn_discriminator.py file. When I use this code, it prints several validation-accuracy values, and they vary a lot.
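One way to get a stable dev accuracy, instead of per-batch values that naturally vary a lot, is to average correctness over the whole dev set. The sketch below is independent of the repo's actual API; `predict_fn` is a hypothetical stand-in for whatever returns the discriminator's class probabilities.

```python
import numpy as np

def dev_accuracy(predict_fn, batches):
    """Average discriminator accuracy over all dev batches.

    predict_fn(inputs) is assumed to return per-class probabilities
    (or logits) of shape [batch, num_classes]; each element of
    `batches` is an (inputs, labels) pair with integer labels.
    """
    correct = total = 0
    for inputs, labels in batches:
        preds = np.argmax(predict_fn(inputs), axis=-1)
        correct += int(np.sum(preds == np.asarray(labels)))
        total += len(labels)
    return correct / total
```

Reporting this single averaged number per evaluation avoids the batch-to-batch noise described above.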
@kellymarchisio Hi, can you show me a data sample from your config_discriminator_pretrain.yaml? For example, dis_positive_data, dis_negative_data, and so on. Thanks.
@ZhenYangIACAS Hi, I see many data files in config_discriminator_pretrain.yaml, for example dis_positive_data, dis_negative_data, dis_dev_positive_data and so on. Can you tell me what this data means? What data do I need to prepare if I want to run the code successfully? Thanks!
@luckper dis_positive_data is the positive data for training the discriminator, and dis_negative_data is the negative data. dis_dev_positive_data is the development data for the discriminator, and so on. To understand these files, I suggest you scan gan_train.py. Some files you should prepare beforehand, and others are generated automatically. I realize that so many files are a little confusing for users; we will restructure the code when we have free time.
@ZhenYangIACAS OK, following your suggestion, I scanned gan_train.py. However, I still have some questions. First, where do dis_dev_positive_data, dis_dev_negative_data, and dis_dev_source_data come from? And how do they differ from dis_positive_data, dis_negative_data, and dis_source_data? Thanks!
@luckper It is easy to build the development sets. We just randomly sampled 200 sentences from dis_positive_data to get dis_dev_positive_data, and similarly we got the corresponding dis_dev_negative_data and dis_dev_source_data.
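Sampling the dev sets this way could look like the sketch below: draw the same random line indices from each line-aligned file so the positive, negative, and source dev sets stay parallel. File handling details are assumptions, not the authors' actual script.

```python
import random

def sample_dev_sets(paths, n=200, seed=0):
    """Sample the same n line indices from each of several line-aligned
    files (e.g. positive, negative and source data), so the resulting
    dev sets remain parallel. Returns a dict of path -> sampled lines."""
    files = {}
    for p in paths:
        with open(p, encoding="utf-8") as f:
            files[p] = f.read().splitlines()
    length = min(len(lines) for lines in files.values())
    idx = random.Random(seed).sample(range(length), min(n, length))
    return {p: [lines[i] for i in idx] for p, lines in files.items()}
```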
@ZhenYangIACAS Hi, I ran generate_sample.sh, but some errors occurred:

Instructions for updating:
Use argmax instead
INFO:root:using rmsprop for g_loss
Traceback (most recent call last):
  File "generate_samples.py", line 60, in ...
@ZhenYangIACAS @luckper Have you solved the "KeyError: 'generator'" error?
I'm hoping for clarification on the files passed into config_discriminator.yaml.
As I understand it:
Is my understanding correct? Can you please provide clarification on how the files in the discriminator pretraining are created?
For context, I am trying to solve an issue where, when running the discriminator pretraining, my loss falls to 2-3 but the accuracy always oscillates around 0.5, even after 700K steps.