Hello,
We have an application with a very small vocabulary (~100 words). With an almost trivial bigram model (kenlm does not appear to support building a unigram model), we see that decoder.decode() produces words that are not in the language model.

Is there some kind of fallback to letter decoding? If so, is there a way to turn it off?
Thanks!