Open stucker0530 opened 9 years ago
Which dump did you download? You'd want this one:
I'm having the same issue (without the max-docs option). I've tried both of the dumps that you suggested. I'm on OS X, if that makes any difference. I have also turned sleep off to rule that out as a cause. The bz2 dump you suggested gave me the highest number of successfully processed documents so far: 534,792. Any guidance would be appreciated.
I'm using the dump enwiki-20140707-pages-articles.xml.bz2 and so far it's working (but only 62k articles in so far).
I am getting the following error when attempting to ingest a local dump of the latest Wikipedia. I am running ES 1.7.1 and stream2es 20150720170522978252e.
[stream2es]$ ./stream2es wiki --max-docs 5 --source ./enwiki-latest-pages-articles1.xml.bz2
java.io.IOException: unexpected end of stream
at org.elasticsearch.river.wikipedia.bzip2.CBZip2InputStream.bsGetBit(CBZip2InputStream.java:371)
at org.elasticsearch.river.wikipedia.bzip2.CBZip2InputStream.recvDecodingTables(CBZip2InputStream.java:476)
at org.elasticsearch.river.wikipedia.bzip2.CBZip2InputStream.getAndMoveToFrontDecode(CBZip2InputStream.java:550)
at org.elasticsearch.river.wikipedia.bzip2.CBZip2InputStream.initBlock(CBZip2InputStream.java:287)
at org.elasticsearch.river.wikipedia.bzip2.CBZip2InputStream.init(CBZip2InputStream.java:246)
at org.elasticsearch.river.wikipedia.bzip2.CBZip2InputStream.<init>(CBZip2InputStream.java:148)
at org.elasticsearch.river.wikipedia.support.WikiXMLParser.getInputSource(WikiXMLParser.java:80)
at org.elasticsearch.river.wikipedia.support.WikiXMLSAXParser.parse(WikiXMLSAXParser.java:68)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at clojure.lang.Reflector.invokeMatchingMethod(Reflector.java:93)
at clojure.lang.Reflector.invokeNoArgInstanceMember(Reflector.java:313)
at stream2es.stream.wiki$fn6612$fn6613.invoke(wiki.clj:45)
at stream2es.main$streamBANG.invoke(main.clj:241)
at stream2es.main$main.invoke(main.clj:329)
at stream2es.main$_main.doInvoke(main.clj:335)
at clojure.lang.RestFn.applyTo(RestFn.java:137)
at stream2es.main.main(Unknown Source)
2015-09-11T11:13:32.937-0600 ERROR unexpected exception: java.io.IOException: unexpected end of stream
2015-09-11T11:13:33.056-0600 INFO 00:00.208 0.0d/s 0.0K/s (0.0mb) indexed 0 streamed 0 errors 0
[stream2es]$
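One thing worth ruling out is a truncated or otherwise damaged download. The trace only says that the bzip2 stream ended unexpectedly, so this is an assumption, not a confirmed cause. The sketch below uses Python's standard `bz2` module to decompress the archive from start to finish and report whether it reaches a clean end of stream; the function name `check_bz2` is made up for illustration. Note that stream2es reads the file with its own `CBZip2InputStream`, which may not behave exactly like Python's decoder, so a clean result here does not prove stream2es will read the file.

```python
import bz2


def check_bz2(path, chunk_size=1 << 20):
    """Decompress the archive at `path` end to end.

    Returns (ok, decompressed_bytes). `ok` is False if the archive is
    truncated or contains invalid data.
    """
    total = 0
    try:
        with bz2.open(path, "rb") as stream:
            while True:
                chunk = stream.read(chunk_size)
                if not chunk:
                    break  # clean end of stream
                total += len(chunk)
    except (EOFError, OSError) as exc:
        # EOFError: file ended before the end-of-stream marker.
        # OSError: the compressed data itself is invalid.
        print(f"{path}: archive looks damaged after {total} decompressed bytes: {exc}")
        return False, total
    return True, total
```

Calling `check_bz2("./enwiki-latest-pages-articles1.xml.bz2")` on the file from the command above would show whether the local copy is complete. The command-line equivalent is `bzip2 -t <file>`. If the check fails, downloading the dump again is the first thing to try.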