Regarding the synthetic data experiment: the nll-test metric is computed with gen_data_loader (the training data), so nll-test is actually nll-train, which makes it useless as a test metric. This is highly misleading.
Also, even if you create a separate test set (test_file.txt), the function generator.get_nll, which is used to compute nll-test, runs some training updates. So even in this case, you would still be training on the test data!
I suspect this error also occurs in other models ...
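To make the point concrete, here is a minimal sketch of what a side-effect-free NLL evaluation on a held-out set could look like. It does not use the repository's code: `ToyGenerator`, `train_step`, `nll`, `evaluate_nll` and the toy batches are all hypothetical names made up for illustration, and a smoothed unigram model stands in for the real generator.

```python
import math


class ToyGenerator:
    """Hypothetical stand-in for the generator: an add-one-smoothed unigram model."""

    def __init__(self, vocab_size):
        self.counts = [1.0] * vocab_size

    def train_step(self, batch):
        # The only method that is allowed to change the parameters.
        for seq in batch:
            for tok in seq:
                self.counts[tok] += 1.0

    def nll(self, batch):
        # Read-only: mean negative log-likelihood per token, no update.
        total = sum(self.counts)
        losses = [-math.log(self.counts[tok] / total) for seq in batch for tok in seq]
        return sum(losses) / len(losses)


def evaluate_nll(model, batches):
    """Average the per-batch NLL without ever calling train_step."""
    per_batch = [model.nll(batch) for batch in batches]
    return sum(per_batch) / len(per_batch)


# Toy data (assumed): token ids in [0, 4), two training batches, one test batch.
train_batches = [[[0, 1, 1, 2], [1, 1, 0, 2]], [[1, 2, 1, 0], [1, 1, 1, 3]]]
test_batches = [[[3, 3, 0, 1], [2, 3, 1, 3]]]  # held out, never passed to train_step

gen = ToyGenerator(vocab_size=4)
for batch in train_batches:
    gen.train_step(batch)

params_before = list(gen.counts)
nll_train = evaluate_nll(gen, train_batches)
nll_test = evaluate_nll(gen, test_batches)
print(f"nll-train={nll_train:.3f}  nll-test={nll_test:.3f}")
```

The two things that matter are that the test loader is a different object from the training loader, and that the evaluation path only reads the parameters. Comparing the parameters before and after evaluation (as with `params_before` here) is a cheap way to check the second condition.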
I found this problem too, in SeqGAN on real data (image_coco, emnlp_news) and in MLE training.
RelGAN may also calculate the NLL using training data instead of test data. See this.