malcolmgreaves / berkeleylm

Automatically exported from code.google.com/p/berkeleylm
Apache License 2.0

Cannot train unigram model with Kneser-Ney #16

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
When I try to train a unigram Kneser-Ney model, I get the exception below. This 
is the offending line:

dotdotTypeCounts = new LongArray[maxNgramOrder - 2];

Here is the exception:

Exception in thread "main" java.lang.NegativeArraySizeException
    at edu.berkeley.nlp.lm.values.KneserNeyCountValueContainer.<init>(KneserNeyCountValueContainer.java:85)
    at edu.berkeley.nlp.lm.io.KneserNeyLmReaderCallback.<init>(KneserNeyLmReaderCallback.java:123)
    at edu.berkeley.nlp.lm.io.LmReaders.createKneserNeyLmFromTextFiles(LmReaders.java:301)
    at edu.berkeley.nlp.lm.io.LmReaders.readKneserNeyLmFromTextFile(LmReaders.java:283)
    at edu.berkeley.nlp.lm.io.LmReaders.readKneserNeyLmFromTextFile(LmReaders.java:272)
    at dragon.lm.NGramLanguageModel.<init>(NGramLanguageModel.java:85)
    at dragon.ml.NaiveBayesClassifier.initalizeLanguageModels(NaiveBayesClassifier.java:154)
    at dragon.ml.NaiveBayesClassifier.main(NaiveBayesClassifier.java:189)

Original issue reported on code.google.com by b...@parakhi.com on 18 Jul 2013 at 11:52

GoogleCodeExporter commented 8 years ago
Kneser-Ney smoothing doesn't make sense for a unigram model, unfortunately. I 
should probably do something reasonable when the user requests a unigram model, 
but I'm not sure that's a critical bug that I will find time to fix (this is no 
longer my day job, so I only fix critical bugs). 

Original comment by adpa...@gmail.com on 19 Jul 2013 at 12:02
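
Until the library handles this case itself, one possible caller-side workaround is to check the order before calling the reader, so the failure shows up as a readable message instead of a NegativeArraySizeException. The sketch below is hypothetical: OrderGuard and requireMinOrder are not part of berkeleylm, and the minimum supported order is passed in as a parameter because the thread does not establish what it actually is.

```java
public class OrderGuard {

    /**
     * Hypothetical helper (not a berkeleylm API): rejects n-gram orders
     * below a caller-chosen minimum with a descriptive message.
     */
    static void requireMinOrder(int maxNgramOrder, int minSupportedOrder) {
        if (maxNgramOrder < minSupportedOrder) {
            throw new IllegalArgumentException(
                "Kneser-Ney training requires an n-gram order of at least "
                    + minSupportedOrder + ", but got " + maxNgramOrder);
        }
    }

    public static void main(String[] args) {
        // An order at or above the assumed minimum passes silently.
        requireMinOrder(3, 2);

        // A unigram request is rejected with a clear message.
        try {
            requireMinOrder(1, 2);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The check would go immediately before the call to LmReaders.readKneserNeyLmFromTextFile in the calling code; the value of 2 used above is an assumption based only on the comment about unigram models.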

GoogleCodeExporter commented 8 years ago
FYI: It also crashes when training a bigram model.

Original comment by acgris...@gmail.com on 30 Apr 2014 at 6:35