senarvi / theanolm

TheanoLM is a recurrent neural network language modeling tool implemented using Theano
Apache License 2.0

Decoding results #38

Closed: amrmalkhatib closed this issue 5 years ago

amrmalkhatib commented 6 years ago

I've trained a language model with the default training parameters, except that I used hierarchical softmax and a batch size of 64. The training data was a huge set of about one million documents, and I converted each document into a single line in the training file. Then I used this language model to decode lattices for spelling correction, where I provide candidates for each spelling error, but the results are frustrating. When I compared the choices of the LM with the baseline (where I manually choose the most frequent word in my data from the candidates), the LM is only 7 to 8 percent more accurate. I inspected the results, and most of the time the LM picks random words over other obviously frequent, correct words. I believe TheanoLM is more robust and accurate than what I got, so I think I may be doing something wrong in the training phase, or there is a better combination of training parameters than the one I used.
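
For reference, the training invocation looked roughly like this. This is a sketch with placeholder file names: the hierarchical softmax is selected with an `hsoftmax` output layer in the architecture description file rather than a command-line switch, and the exact option names should be checked against `theanolm train --help`.

```bash
# Train with default optimization parameters; only the batch size and
# the output layer type (hsoftmax, set in the architecture file) differ
# from the defaults. All file names are placeholders.
theanolm train model.h5 \
    --training-set train.txt \
    --validation-file valid.txt \
    --architecture my-hsoftmax.arch \
    --batch-size 64
```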

senarvi commented 6 years ago

Maybe you should first forget lattice decoding and make sure that your model is working correctly. Inspect some very simple sentences. Use the score command to obtain word probabilities and see whether TheanoLM gives a higher probability to the "random" word than to the frequent word. Generally, frequent words should get higher probabilities than very infrequent words. You can also train an n-gram LM using e.g. SRILM and see what probabilities it gives. I suggest also comparing n-gram and TheanoLM perplexities. If TheanoLM gives a higher perplexity, then something almost certainly went wrong with the training hyperparameters.
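
Something along these lines should work. File names are placeholders, and the SRILM commands assume a standard interpolated Kneser-Ney trigram as the baseline:

```bash
# Per-word log probabilities, to compare the "random" word against
# the frequent word in the same context.
theanolm score model.h5 sentences.txt --output word-scores

# Corpus-level perplexity from TheanoLM.
theanolm score model.h5 test.txt --output perplexity

# Baseline: train an interpolated Kneser-Ney trigram with SRILM
# and compute its perplexity on the same test set.
ngram-count -text train.txt -order 3 -kndiscount -interpolate -lm trigram.lm
ngram -lm trigram.lm -ppl test.txt
```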