Closed: YongfeiYan closed this issue 4 years ago
I had another close look at both the Texygen and RelGAN implementations of nll_gen, but I didn't find any difference between them. Yes, RelGAN literally reuses the Texygen evaluation, and both use the training data to evaluate nll_gen, which, in my experience, is fairly consistent with human observation. I strongly encourage you to try using the test data instead and see whether there is any difference from using the training data to measure NLL_gen. It would be great if it turns out that using test data is a better way of measuring diversity.
Thanks for your reply; I will use the test data and see the difference. One more question: did you use the default parameter configurations of Texygen in the experiments with MLE, SeqGAN, etc.?
Yes, I used the default settings in Texygen when evaluating previous models.
I noticed that NLL_gen in this repo is calculated using the training data (e.g., the image_coco sentences) as the real distribution. However, the training data was also used to pretrain the generator. Is that correct? I think NLL_gen should be measured as $\mathrm{NLL}_{\mathrm{gen}} = -\mathbb{E}_{x \sim P_r^{\mathrm{test}}}[\log P_\theta(x)]$ rather than $-\mathbb{E}_{x \sim P_r^{\mathrm{train}}}[\log P_\theta(x)]$.
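For concreteness, here is a minimal sketch of the computation being debated: scoring a set of real sentences under the generator's autoregressive distribution and averaging the negative log-likelihood per token. This is written in PyTorch rather than the TensorFlow used by Texygen/RelGAN, and `ToyGenerator`, `nll_gen`, and `pad_id` are illustrative assumptions, not names from either repo; the only point is that the same function can be fed the training split (what the repo reports) or the test split (what is being proposed).

```python
import torch
import torch.nn as nn

class ToyGenerator(nn.Module):
    """Stand-in autoregressive LM; the actual generator is assumed to
    expose log p_theta(x_t | x_<t) in the same next-token fashion."""
    def __init__(self, vocab_size=5000, emb=32, hidden=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.rnn = nn.LSTM(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, x):
        h, _ = self.rnn(self.embed(x))
        return self.out(h)  # logits, shape (B, T, V)

def nll_gen(model, batches, pad_id=0):
    """NLL_gen = -E_{x ~ P_r}[log P_theta(x)], averaged per token.
    Pass *test* batches here to estimate the expectation over P_r^test."""
    model.eval()
    loss_fn = nn.CrossEntropyLoss(ignore_index=pad_id, reduction="sum")
    total_loss, total_tokens = 0.0, 0
    with torch.no_grad():
        for batch in batches:  # batch: LongTensor (B, T) of token ids
            logits = model(batch[:, :-1])   # predict each next token
            targets = batch[:, 1:]
            total_loss += loss_fn(logits.reshape(-1, logits.size(-1)),
                                  targets.reshape(-1)).item()
            total_tokens += (targets != pad_id).sum().item()
    return total_loss / max(total_tokens, 1)

# Usage: compare the two estimates discussed in this thread.
# nll_train = nll_gen(gen, train_batches)  # expectation over P_r^train
# nll_test  = nll_gen(gen, test_batches)   # expectation over P_r^test
```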