Closed · vince62s · closed 8 years ago
To illustrate my point, here are the two runs.

HS run (use_nce=0):
Read the vocabulary: 149999 words
Restoring existing nnet
Constructing RNN: layer_size=400, layer_type=sigmoid, layer_count=1, maxent_hash_size=1999936667, maxent_order=4, vocab_size=149999, use_nce=0
Contructed HS: arity=2, height=28
Test entropy 6.834538
Perplexity is 114.13

NCE run (use_nce=1):
Read the vocabulary: 149999 words
Restoring existing nnet
Constructing RNN: layer_size=400, layer_type=sigmoid, layer_count=1, maxent_hash_size=1999936667, maxent_order=4, vocab_size=149999, use_nce=1
Constructing NCE: layer_size=400, maxent_hash_size=1999936667, cuda=0, ln(Z)=9.000000
Use -nce-accurate-test to calculate entropy
Perplexity is 123.375
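For reference, the HS perplexity is consistent with the reported test entropy if that entropy is in bits per word (perplexity = 2^entropy). A minimal sanity check in Python, under that base-2 assumption:

```python
# Sanity check (assumption: the reported test entropy is in bits per word,
# so perplexity = 2 ** entropy).
test_entropy_bits = 6.834538          # "Test entropy" from the HS log above
perplexity = 2 ** test_entropy_bits
print(f"Perplexity: {perplexity:.2f}")  # ~114.1, matching "Perplexity is 114.13"
```

Note that the NCE run above prints only a perplexity; as its own log line says, entropy is only computed when -nce-accurate-test is passed.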
Anton,
I am running the toolkit on the text from the Cantab-Tedlium recipe in Kaldi. It is a text file of about 900 MB with a vocabulary of 150K words. My question is: do you think it is normal to get a perplexity of 114 in HS mode and 124 in NCE mode? Based on your home page, I would have expected the NCE results to be better than the HS ones.
(The parameters are the ones from the WSJ recipe for rnnlm.)
Thanks, Vincent