Closed chenwq95 closed 5 years ago
What settings did you use in your experiment? We use TF 1.4. Also, how did you run the BLEU evaluation — did you use the evaluation program in Texygen?
I think it is due to the default smoothing function.
Before NLTK 3.3 (released 2018-05-06), there was a bug in SmoothingFunction().method0
(i.e. no smoothing),
which was the default smoothing function of sentence_bleu.
That bug led to incorrectly high BLEU values for cases with no matched n-gram (common when n is high).
(See: https://github.com/nltk/nltk/pull/1545)
Texygen uses SmoothingFunction().method1
by default, which uses 0.1 as the numerator of the precision instead of 0 when there is no matched n-gram,
so it is unaffected by the bug regardless of the NLTK version.
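To illustrate the difference, here is a minimal sketch using NLTK's sentence_bleu with method1 smoothing (the reference and hypothesis sentences below are made-up examples, not from Texygen's data). With a short hypothesis, the 3-gram and 4-gram precisions have no matches; method1 replaces the zero numerators with a small epsilon (0.1), so the score stays low but nonzero, whereas method0 on NLTK < 3.3 could return a spuriously high score in this situation.

```python
# Sketch: BLEU with method1 smoothing, as Texygen uses by default.
# Sentences here are hypothetical examples for illustration only.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = [["the", "cat", "sat", "on", "the", "mat"]]
hypothesis = ["the", "cat", "on", "mat"]

# method1: if an n-gram precision numerator is 0, it is replaced by
# epsilon (default 0.1), avoiding the degenerate zero/inflated scores.
smooth1 = SmoothingFunction().method1
score = sentence_bleu(
    reference,
    hypothesis,
    weights=(0.25, 0.25, 0.25, 0.25),  # standard BLEU-4 weights
    smoothing_function=smooth1,
)
print(score)  # a low but strictly positive value
```

Requires NLTK >= 3.2; on NLTK < 3.3 the same call *without* smoothing_function (i.e. method0) is the buggy path described above.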
The BLEU score of LeakGAN is far lower than that reported in the paper.
How did you get the BLEU score? Thanks