Different perplexity results from estimate-ngram and evaluate-ngram

What steps will reproduce the problem?
1. Create an LM with estimate-ngram, passing the -unk and -eval-perp parameters
2. Run evaluate-ngram with -eval-perp on the same LM
3. Perplexity results differ

What is the expected output? What do you see instead?

$ evaluate-ngram -lm rlst8-similar.lm -eval-perp "$TRANSCRIPT_CONT, $TRANSCRIPT_SENT"
0.001   Loading LM rlst8-similar.lm...
7.262   Perplexity Evaluations:
7.262   Loading eval set /data/src/sphinx/experiments/transcripts/rlst-transcript.corpus...
7.318       /data/src/sphinx/experiments/transcripts/rlst-transcript.corpus 385.071
7.322   Loading eval set /data/src/sphinx/experiments/transcripts/rlst-transcript.sentences...
7.376       /data/src/sphinx/experiments/transcripts/rlst-transcript.sentences  312.224

$ estimate-ngram -unk 1 -vocab $VOCAB_AUGMENTED -text $SENTENCE_CORPUS -wl $LM_SIMILAR -eval-perp "$TRANSCRIPT_CONT, $TRANSCRIPT_SENT"
0.001   Replace unknown words with <unk>...
0.001   Loading vocab rlst8-merged-vocab.txt...
0.013   Loading corpus sentences.similar.corpus...
10.127  Smoothing[1] = ModKN
10.127  Smoothing[2] = ModKN
10.127  Smoothing[3] = ModKN
10.127  Set smoothing algorithms...
10.243  Estimating full n-gram model...
10.459  Saving LM to rlst8-similar.lm...
14.192  Perplexity Evaluations:
14.192  Loading eval set /data/src/sphinx/experiments/transcripts/rlst-transcript.corpus...
14.351      /data/src/sphinx/experiments/transcripts/rlst-transcript.corpus 377.913
14.359  Loading eval set /data/src/sphinx/experiments/transcripts/rlst-transcript.sentences...
14.516      /data/src/sphinx/experiments/transcripts/rlst-transcript.sentences  307.090

I would expect the two sets of perplexity results to be the same.
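
To put a number on the gap, here is a small Python calculation using the perplexities from the two runs above (reading the wrapped values as 312.224 and 307.090). The dictionary and function names are only for illustration.

```python
# Perplexities copied from the two runs above (eval set -> value).
evaluate_ngram = {
    "rlst-transcript.corpus": 385.071,
    "rlst-transcript.sentences": 312.224,
}
estimate_ngram = {
    "rlst-transcript.corpus": 377.913,
    "rlst-transcript.sentences": 307.090,
}


def relative_gap(a, b):
    """Relative difference of a with respect to b, in percent."""
    return 100.0 * (a - b) / b


gaps = {
    name: relative_gap(evaluate_ngram[name], estimate_ngram[name])
    for name in estimate_ngram
}

for name, gap in gaps.items():
    print(f"{name}: evaluate-ngram is {gap:.2f}% higher than estimate-ngram")
```

On both eval sets evaluate-ngram reports the higher perplexity, by a little under 2%.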

The difference appears to arise from use of the "-unk" parameter. Without it 
(i.e. when the LM excludes <unk>), the perplexity results from estimate-ngram 
and evaluate-ngram are the same.
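
I have not traced the cause in the MITLM source, so the following is only a toy Python sketch of one way <unk> handling can change a perplexity figure: out-of-vocabulary words are either mapped to <unk> and scored, or dropped from the evaluation. The unigram model, the probabilities, and the `score_unk` switch are all made up for illustration and are not MITLM's code.

```python
import math


def perplexity(tokens, probs, vocab, score_unk):
    """Perplexity of tokens under a toy unigram model.

    Out-of-vocabulary tokens are either mapped to <unk> and scored
    (score_unk=True) or dropped from the evaluation (score_unk=False).
    """
    log_prob = 0.0
    count = 0
    for tok in tokens:
        if tok not in vocab:
            if not score_unk:
                continue  # skip the OOV token entirely
            tok = "<unk>"  # score it with the <unk> probability
        log_prob += math.log(probs[tok])
        count += 1
    return math.exp(-log_prob / count)


# Hypothetical model and eval text, chosen only to show the effect.
probs = {"the": 0.5, "cat": 0.3, "<unk>": 0.2}
vocab = {"the", "cat"}
tokens = ["the", "cat", "the", "zebra"]

ppl_scored = perplexity(tokens, probs, vocab, score_unk=True)
ppl_skipped = perplexity(tokens, probs, vocab, score_unk=False)
print(f"OOV scored as <unk>: {ppl_scored:.3f}")
print(f"OOV skipped:         {ppl_skipped:.3f}")
```

If the two tools treat unknown words differently at evaluation time, that would be consistent with the results agreeing only when the LM has no <unk>.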

What version of the product are you using? On what operating system?

r48

MacOS X 10.6.1

Please provide any additional information below.

Original issue reported on code.google.com by smarqu...@gmail.com on 4 Jun 2011 at 7:42
