marcusklang / wikiforia

A Utility Library for Wikipedia dumps
GNU General Public License v2.0

Exception thrown in main #10

Open ghost opened 7 years ago

ghost commented 7 years ago

I tried running:

java -jar wikiforia-1.2.1.jar -pages /home/sudeshna/wikiforia-master/enwiki-20161201-pages-articles-multistream.xml.bz2 -output /home/sudeshna/ -outputformat plain-text

I am getting this exception:

Exception in thread "main" java.io.IOError: java.io.IOException: unexpected end of stream
    at se.lth.cs.nlp.mediawiki.parser.MultistreamBzip2XmlDumpParser$PageReader.<init>(MultistreamBzip2XmlDumpParser.java:213)
    at se.lth.cs.nlp.mediawiki.parser.MultistreamBzip2XmlDumpParser.<init>(MultistreamBzip2XmlDumpParser.java:107)
    at se.lth.cs.nlp.wikiforia.App.convert(App.java:303)
    at se.lth.cs.nlp.wikiforia.App.main(App.java:488)
Caused by: java.io.IOException: unexpected end of stream
    at org.apache.commons.compress.compressors.bzip2.BZip2CompressorInputStream.bsGetBit(BZip2CompressorInputStream.java:398)
    at org.apache.commons.compress.compressors.bzip2.BZip2CompressorInputStream.recvDecodingTables(BZip2CompressorInputStream.java:499)
    at org.apache.commons.compress.compressors.bzip2.BZip2CompressorInputStream.getAndMoveToFrontDecode(BZip2CompressorInputStream.java:573)
    at org.apache.commons.compress.compressors.bzip2.BZip2CompressorInputStream.initBlock(BZip2CompressorInputStream.java:311)
    at org.apache.commons.compress.compressors.bzip2.BZip2CompressorInputStream.<init>(BZip2CompressorInputStream.java:133)
    at org.apache.commons.compress.compressors.bzip2.BZip2CompressorInputStream.<init>(BZip2CompressorInputStream.java:109)
    at se.lth.cs.nlp.mediawiki.parser.MultistreamBzip2XmlDumpParser$PageReader.readHeader(MultistreamBzip2XmlDumpParser.java:237)
    at se.lth.cs.nlp.mediawiki.parser.MultistreamBzip2XmlDumpParser$PageReader.<init>(MultistreamBzip2XmlDumpParser.java:211)
    ... 3 more

What am I doing wrong?

Thanks!
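
One thing that may be worth ruling out: the trace shows `BZip2CompressorInputStream` running out of bytes in `recvDecodingTables` while `readHeader` is opening a bzip2 stream. That could point to an incomplete or damaged dump file, though the trace alone does not confirm it. As a minimal sketch, the hypothetical helper below (`Bzip2HeaderCheck`, not part of wikiforia) only checks that a file starts with a bzip2 stream header followed by a block header. It does not detect truncation further into the file; a full integrity test such as `bzip2 -t` would be needed for that.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of wikiforia.
class Bzip2HeaderCheck {

    // bzip2 block-header magic, 0x314159265359
    private static final byte[] BLOCK_MAGIC = {0x31, 0x41, 0x59, 0x26, 0x53, 0x59};

    /** Returns true if the bytes start with "BZh", a level digit 1-9, and a block header. */
    static boolean looksLikeBzip2(byte[] head) {
        if (head.length < 10) {
            return false; // too short to hold stream header plus block magic
        }
        if (head[0] != 'B' || head[1] != 'Z' || head[2] != 'h') {
            return false; // missing stream magic
        }
        if (head[3] < '1' || head[3] > '9') {
            return false; // block-size digit out of range
        }
        for (int i = 0; i < BLOCK_MAGIC.length; i++) {
            if (head[4 + i] != BLOCK_MAGIC[i]) {
                return false; // first block header not where expected
            }
        }
        return true;
    }

    /** Reads up to n bytes from the start of the file (fewer if the file is shorter). */
    static byte[] readHead(Path file, int n) throws IOException {
        byte[] buf = new byte[n];
        int filled = 0;
        try (InputStream in = Files.newInputStream(file)) {
            while (filled < n) {
                int r = in.read(buf, filled, n - filled);
                if (r < 0) {
                    break; // end of file reached early
                }
                filled += r;
            }
        }
        return Arrays.copyOf(buf, filled);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java Bzip2HeaderCheck <file.bz2>");
            return;
        }
        byte[] head = readHead(Paths.get(args[0]), 10);
        System.out.println(looksLikeBzip2(head)
                ? "header looks like bzip2"
                : "header does NOT look like bzip2");
    }
}
```

If the header looks fine, comparing the file size or checksum against the values published alongside the dump would be a reasonable next step before looking at the parser itself.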