Closed: ben-8878 closed this issue 3 years ago
Perplexity can't be compared across models with different vocabularies. In the extreme case, a model whose vocabulary is only OOV will have perplexity 1, because everything maps to `<unk>` and p(`<unk>`) = 1.
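To spell out that extreme case: perplexity is the inverse geometric mean of the per-token probabilities, so if every token maps to `<unk>` with probability 1, it collapses to its minimum:

$$\mathrm{ppl} = \left(\prod_{i=1}^{N} p(w_i)\right)^{-1/N} = 1^{-1/N} = 1$$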
SRILM does another "hack" where it turns off interpolation for unigrams, resulting in a higher p(`<unk>`) than many words have. If your test corpus has a high OOV rate, this will make the perplexity lower. Conversely, it should make perplexity higher on a test corpus with a low OOV rate. It also has the strange effect that a system will prefer to generate `<unk>` over words that exist in the vocabulary. If you want that behavior, pass `--interpolate_unigrams 0` to `lmplz`.
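For concreteness, a minimal sketch of how that flag fits into a training command (the corpus and output paths are placeholders):

```bash
# Train a 4-gram model with SRILM-style uninterpolated unigrams.
# lmplz reads tokenized text from stdin and writes an ARPA file to stdout.
lmplz -o 4 --interpolate_unigrams 0 <corpus.txt >model.arpa
```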
@kpu It performs better with `--interpolate_unigrams 0`. I'll close the issue, thank you.
I trained 3-gram SRILM and KenLM models on 30 GB of text, and a 4-gram KenLM model with the following parameters:
```
kenlm_opts="-o 4 --prune 0 0 1 1 -S 50% -T /data/temp"
```
```
kenlm: 206 sentences, 3277 words, 0 OOVs, 0 zeroprobs, logprob= -10986.33 ppl= 1426.499 ppl1= 2251.937
srilm: 206 sentences, 3277 words, 206 OOVs, 0 zeroprobs, logprob= -9656.414 ppl= 884.5531 ppl1= 1394.401
```
KenLM uses a better smoothing method, so I don't understand why it reports a higher perplexity.
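One detail in the output above is worth noting. SRILM's `ngram -ppl` computes ppl = 10^(-logprob / (words − OOVs − zeroprobs + sentences)), and the KenLM model has `<unk>` in its vocabulary, so all 206 OOV tokens get scored, while the SRILM model drops them from both the log-probability and the token count. Under that reading, the reported numbers reconstruct exactly:

$$\text{kenlm: } 10^{10986.33/(3277 - 0 + 206)} \approx 1426.5, \qquad \text{srilm: } 10^{9656.414/(3277 - 206 + 206)} \approx 884.55$$

So the two perplexities are normalized over different token sets, which is the vocabulary mismatch described above.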