AnantLabs / berkeleylm

Automatically exported from code.google.com/p/berkeleylm

ArrayIndexOutOfBoundsException when running MakeLmBinaryFromGoogle #2

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
When running MakeLmBinaryFromGoogle, I get the exception below (the last lines 
of the logger output are included).
The same exception is thrown if I call readLmFromGoogleNgramDir(path, compress) 
directly with compress set to true.
I have not yet been able to figure out what is going on.
Do you have any clues?

-Torsten

<trace ---------------------------------------------------------->
                Line 13587000
                Line 13588000
            } [1m14s]
        } [1m14s]
        Reading ngrams of order 2 {
Exception in thread "main"      } [0s]
java.lang.ArrayIndexOutOfBoundsException: 1
    at edu.berkeley.nlp.lm.map.CompressedNgramMap.handleNgramsFinished(CompressedNgramMap.java:135)
    at edu.berkeley.nlp.lm.io.NgramMapAddingCallback.handleNgramOrderFinished(NgramMapAddingCallback.java:40)
    at edu.berkeley.nlp.lm.io.GoogleLmReader.parse(GoogleLmReader.java:99)
    at edu.berkeley.nlp.lm.io.GoogleLmReader.parse(GoogleLmReader.java:25)
    at edu.berkeley.nlp.lm.io.LmReaders.buildMapCommon(LmReaders.java:437)
    at edu.berkeley.nlp.lm.io.LmReaders.secondPassGoogle(LmReaders.java:391)
    at edu.berkeley.nlp.lm.io.LmReaders.readLmFromGoogleNgramDir(LmReaders.java:210)
    at edu.berkeley.nlp.lm.io.LmReaders.readLmFromGoogleNgramDir(LmReaders.java:193)
    at de.tudarmstadt.ukp.dkpro.teaching.frequency.berkeleylm.CreateGoogleBinary.run(CreateGoogleBinary.java:25)
    at de.tudarmstadt.ukp.dkpro.teaching.frequency.berkeleylm.CreateGoogleBinary.main(CreateGoogleBinary.java:18)

</trace ---------------------------------------------------------->

Original issue reported on code.google.com by torsten....@gmail.com on 29 Jun 2011 at 8:20

GoogleCodeExporter commented 9 years ago
I am out of the country right now, so it is hard (but not impossible) for me 
to debug. I'm trying to reproduce this right now. Is this on the English Web1T 
setup? Have you checked with both versions of the code? I released 1.0b2 in a 
hurry and might have introduced a bug not caught by the tests. Is the 
exception also thrown when you run on the small test example in the following 
directory?

test/edu/berkeley/nlp/lm/io/googledir/

Original comment by adpa...@gmail.com on 29 Jun 2011 at 7:58

GoogleCodeExporter commented 9 years ago

Original comment by adpa...@gmail.com on 29 Jun 2011 at 8:07

GoogleCodeExporter commented 9 years ago
I haven't been able to reproduce this on my end. 

Original comment by adpa...@gmail.com on 30 Jun 2011 at 5:55

GoogleCodeExporter commented 9 years ago
Thanks for the quick answer.

When comparing my Google n-gram files with the test example, I realized that I 
had unzipped mine for use with another library, whereas the files in the test 
example are gzipped. After gzipping them again, everything now works fine.
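
For anyone hitting the same exception, here is a minimal sketch of a pre-flight 
check that could be run before building the binary. It is not part of 
berkeleylm; the class and method names are made up for illustration. It assumes 
that every regular file under the n-gram directory is expected to be gzipped, 
and it only looks at the two gzip magic bytes (0x1f 0x8b) at the start of each 
file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical helper, not part of berkeleylm: lists files that are not gzipped.
class GzipPreflightCheck {

    /** Returns true if the file starts with the gzip magic bytes 0x1f 0x8b. */
    static boolean looksGzipped(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            int first = in.read();   // -1 if the file is empty
            int second = in.read();
            return first == 0x1f && second == 0x8b;
        }
    }

    /** Collects every regular file under root that does not look gzipped. */
    static List<Path> findUncompressedFiles(Path root) throws IOException {
        List<Path> offenders = new ArrayList<>();
        try (Stream<Path> paths = Files.walk(root)) {
            for (Path p : (Iterable<Path>) paths::iterator) {
                if (Files.isRegularFile(p) && !looksGzipped(p)) {
                    offenders.add(p);
                }
            }
        }
        return offenders;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java GzipPreflightCheck <ngram-dir>");
            return;
        }
        List<Path> offenders = findUncompressedFiles(Paths.get(args[0]));
        if (offenders.isEmpty()) {
            System.out.println("All files look gzipped.");
        } else {
            // Assumption: these files need to be gzipped again before reading.
            System.out.println("Files that do not look gzipped:");
            for (Path p : offenders) {
                System.out.println("  " + p);
            }
        }
    }
}
```

Running it with the n-gram directory as the only argument prints any file 
that would need to be gzipped again.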

Thanks for making the library available,
Torsten

btw: very nice talk at ACL

Original comment by torsten....@gmail.com on 30 Jun 2011 at 9:15

GoogleCodeExporter commented 9 years ago
I'll try to add some more helpful error messages.

Original comment by adpa...@gmail.com on 1 Jul 2011 at 4:18