elastic / stream2es

Stream data into ES (Wikipedia, Twitter, stdin, or other ESes)

Code added to make stream death more pleasant obscures OOM exception #40

Closed: jonahbull closed this issue 6 years ago

jonahbull commented 9 years ago

I ran into this issue trying to stream a not terribly big index (~6GB) with an average document size of 2-3MB. stream2es version info:

./stream2es --version
2015-01-03T20:27:48.473-0700 INFO  stream2es 2014122282ace27

And here's a quick log snippet that illustrates the issue:

./stream2es es --scroll-size 50 --scroll-time 5m --source http://localhost:9200/myindex --target http://localhost:9200/myindex_v1 --log trace
2015-01-02T18:10:37.134-0700 DEBUG create index http://localhost:9200/myindex_v1
2015-01-02T18:10:37.057-0700 TRACE waiting for collectors
2015-01-02T18:10:37.278-0700 TRACE PUT 317021 bytes
2015-01-02T18:10:38.830-0700 DEBUG stream es from http://localhost:9200/myindex to http://localhost:9200/myindex_v1
java.lang.NullPointerException
        at java.util.regex.Matcher.getTextLength(Matcher.java:1234)
        at java.util.regex.Matcher.reset(Matcher.java:308)
        at java.util.regex.Matcher.<init>(Matcher.java:228)
        at java.util.regex.Pattern.matcher(Pattern.java:1088)
        at clojure.core$re_matcher.invoke(core.clj:4386)
        at clojure.core$re_find.invoke(core.clj:4438)
        at stream2es.es$scroll_STAR_.invoke(es.clj:72)
        at stream2es.es$scroll.invoke(es.clj:79)
        at stream2es.es$scan.invoke(es.clj:103)
        at stream2es.stream.es$make_callback$fn__3008.invoke(es.clj:75)
        at stream2es.main$stream_BANG_.invoke(main.clj:241)
        at stream2es.main$main.invoke(main.clj:330)
        at stream2es.main$_main.doInvoke(main.clj:336)
        at clojure.lang.RestFn.applyTo(RestFn.java:137)
        at stream2es.main.main(Unknown Source)
2015-01-02T18:13:44.319-0700 ERROR unexpected exception: java.lang.NullPointerException
2015-01-02T18:13:44.468-0700 INFO  03:07.409 0.0d/s 0.0K/s (0.0mb) indexed 0 streamed 0 errors 0

After some investigation, I figured out that body was nil in scroll*, which of course caused the NPE trying to call re-find. I hacked in some more logging and the reason body was nil was that stream2es was OOMing (java.lang.OutOfMemoryError: GC overhead limit exceeded) trying to parse the first set of returned documents from the scroll. Lowering the scroll size, as suggested in #24, fixed that.

Long story short, it'd be nice if the exception handling, er, handled this particular case, if only to save people time trying to figure out the real issue behind the NPE. My simple hack around this was to add a nil check for body to the cond before the search-context-missing bits and throw the exception if body is nil, but my Clojure is rudimentary at best.
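
For readers who want to see the failure mode in isolation, here is a minimal Java sketch of the nil check described above. It is not the stream2es code: the class name, method names, and the `SearchContextMissingException` pattern are hypothetical stand-ins, and the only thing taken from the trace is that handing a null string to `Pattern.matcher` fails inside `java.util.regex.Matcher` with a bare NullPointerException. The guarded variant shows one way the original failure could be kept attached as the cause.

```java
import java.util.regex.Pattern;

public class ScrollBodyGuard {
    // Hypothetical stand-in for the search-context-missing check done on the scroll response body.
    private static final Pattern CONTEXT_MISSING =
            Pattern.compile("SearchContextMissingException");

    /** Unguarded: a null body reaches the regex and fails with a bare NullPointerException. */
    static boolean contextMissingUnguarded(String body) {
        return CONTEXT_MISSING.matcher(body).find();
    }

    /** Guarded: a null body is reported explicitly, with the earlier failure attached as the cause. */
    static boolean contextMissingGuarded(String body, Throwable fetchFailure) {
        if (body == null) {
            throw new IllegalStateException("scroll response body was null", fetchFailure);
        }
        return CONTEXT_MISSING.matcher(body).find();
    }

    public static void main(String[] args) {
        // Simulated root cause, standing in for the OOM that left body unset.
        Throwable rootCause = new OutOfMemoryError("GC overhead limit exceeded");

        try {
            contextMissingUnguarded(null);
        } catch (NullPointerException e) {
            // The real reason for the failure is not visible here.
            System.out.println("unguarded: " + e.getClass().getName());
        }

        try {
            contextMissingGuarded(null, rootCause);
        } catch (IllegalStateException e) {
            // The root cause travels with the exception.
            System.out.println("guarded: " + e.getMessage() + " (cause: " + e.getCause() + ")");
        }
    }
}
```

Whether stream2es should rethrow, log, or retry when body is nil is a separate design question; the sketch only illustrates why checking before the regex call makes the underlying error visible.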

drewr commented 9 years ago

Thanks for the report @jonahbull. OOM is a tricky situation in the JVM. Let me think about it...

mcantrell commented 9 years ago

I'm not entirely sure it's masking an OOM exception. I'm getting the same error with --scroll-size 1.

jonahbull commented 9 years ago

@mcantrell It was in my case, but I think any exception that caused body to be nil in scroll* would produce this behavior.

jonahbull commented 6 years ago

Well, this has aged like a fine wine. We ended up switching to the reindexing API some years ago. Going to close this out, @drewr 😀