elastic / stream2es

Stream data into ES (Wikipedia, Twitter, stdin, or other ESes)

Failing .. status: 500 #35

Closed rahst12 closed 10 years ago

rahst12 commented 10 years ago

Hi, I'm getting an error when trying to copy a large index (~800,000,000 documents) to another index. We've hit it twice, and each time it crashes stream2es.

I've included some of the error details below. We're curious: when we restart stream2es, will it try to copy all the same documents over again (creating duplicate documents in the second index), or is it smart about what it copies?

Thanks

Details:

{:throwable #<ExceptionInfo: clj-http:status 500 {:object {:trace-redirects.......}

Reason: remoteTransportException[[server_hostname]][inet[/ip:9300]][search/phase/scan/scroll]]; nested: SearchContextMissingException[No search context found for id [139746]];.. max_score 0.0 hits: []

at clj_http.client$wrap_exceptions$fn1034.invoke(client.clj:111)
at clj_http.client$wrap_accept$fn1154.invoke(client.clj:380)
at clj_http.client$wrap_accept_encoding$fn1160.invoke(client.clj:394)
at clj_http.client$wrap_content_type$fn__1149.invoke(client.clj:370)
at clj_http.client$wrap_form_param$fn1198.invoke(client.clj:481)
at clj_http.client$wrap_param$fn1216.invoke(client.clj:505)
at clj_http.client$wrap_method$fn1193.invoke(client.clj.464)
at clj_http.client$wrap_cookies$fn645.invoke(cookies.clj:118)
at clj_http.client$wrap_link$fn675.invoke(links.clj:50)
at clj_http.client$wrap_unknown_host$fn1225.invoke(client.clj:524)
at clj_http.client$get.doInvoke(client.clj:615)
at clojure.lang.RestFn.invoke(RestFn.java:423)
at stream2es.es$scrollSTAR.invoke(es.clj:64)
at stream2es.es$scroll.invoke(es.clj:72)
at stream2es.es$scroll$fn__1307.invoke(es.clj:77)
at clojure.lang.LazySeq.sval(LazySeq.java:42)
at clojure.lang.LazySeq.seq(LazySeq.java:67)
at clojure.lang.RT.seq(RT.java:484)
at clojure.core$seq.invoke(core.clj:133)
at stream2es.stream.es$make_callback$fn1904.invoke(es.clj:68)
at stream2es.main$streamBANG.invoke(main.clj:291)
at stream2es.main$main.invoke(main.clj:417)
at stream2es.main$_main.doInvoke(main.clj:435)
at clojure.lang.RestFn.applyTo(RestFn.java:137)
at stream2es.main.main(Unknown Source)

drewr commented 10 years ago

This means the scroll context expired before stream2es came back for the next batch. If your cluster is busy, or if you increased --scroll-size, a batch can take longer to process than --scroll-time allows, and ES discards the search context in the meantime. So, you should increase --scroll-time.
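
To make the failure mode concrete, here is a toy in-memory model of a scroll keep-alive. It is only a sketch: ToyScroll, copy_all, and the simulated clock are hypothetical names made up for illustration, and none of this is stream2es or Elasticsearch code. It assumes the context is dropped once the client stays away longer than the keep-alive, and that each successful scroll request renews it.

```python
class SearchContextMissing(Exception):
    """Stands in for ES's SearchContextMissingException (toy model only)."""


class ToyScroll:
    """Hypothetical in-memory stand-in for a server-side scroll context."""

    def __init__(self, docs, scroll_size, scroll_time):
        self.docs = docs
        self.scroll_size = scroll_size   # hits returned per scroll request
        self.scroll_time = scroll_time   # keep-alive, in simulated seconds
        self.pos = 0
        self.expires_at = scroll_time    # context is created at t = 0

    def next_batch(self, now):
        # The context is gone if the client returns after the keep-alive.
        if now > self.expires_at:
            raise SearchContextMissing("No search context found")
        batch = self.docs[self.pos:self.pos + self.scroll_size]
        self.pos += len(batch)
        # Each successful scroll request renews the keep-alive.
        self.expires_at = now + self.scroll_time
        return batch


def copy_all(scroll, seconds_per_doc):
    """Drain the scroll, spending seconds_per_doc to index each hit."""
    now, copied = 0.0, 0
    while True:
        batch = scroll.next_batch(now)
        if not batch:
            return copied
        copied += len(batch)
        now += len(batch) * seconds_per_doc  # simulated indexing time
```

In this model, 100 documents at one second each copy fine with a batch size of 10 and a 60-second keep-alive. Raising the batch size to 100 makes the first batch take 100 seconds, so the next request raises SearchContextMissing. Raising the keep-alive to 120 seconds makes it succeed again.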

Something I've wanted to do is dynamically set --scroll-time based on how long the scroll request takes, but until then I think I'll just make the error above more helpful.
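
One way the dynamic idea could work, purely as a sketch: derive the next keep-alive from how long recent batches took, with some headroom. The function name next_scroll_time and the headroom, floor, and ceiling parameters are assumptions for illustration. stream2es does not do this today, and the defaults below are arbitrary.

```python
import math


def next_scroll_time(recent_batch_seconds, headroom=3.0, floor=60, ceiling=3600):
    """Suggest a keep-alive in whole seconds from recent batch round-trip times.

    recent_batch_seconds: durations (in seconds) of the last few
    fetch-and-index cycles; hypothetical input, measured by the caller.
    """
    if not recent_batch_seconds:
        return floor  # nothing measured yet, fall back to the minimum
    slowest = max(recent_batch_seconds)
    suggested = math.ceil(slowest * headroom)
    # Clamp so one outlier cannot produce a tiny or unbounded keep-alive.
    return max(floor, min(ceiling, suggested))
```

Basing the estimate on the slowest recent batch rather than the average is deliberately conservative: a keep-alive that is too long only holds resources on the cluster a little longer, while one that is too short ends the whole copy.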

drewr commented 10 years ago

It's now gonna look something like this:

01:51.465 1737.4d/s 1238.8K/s 193658 1490 1087546 0 AUi29NKt0iGFhozPkUHJ
01:51.806 1745.4d/s 1244.5K/s 195148 1490 1087729 0 AUi29NZ_0iGFhozPkU1u
01:52.177 1752.9d/s 1249.9K/s 196638 1490 1087714 0 AUi29NsL0iGFhozPkVkT
01:52.575 1760.0d/s 1254.9K/s 198128 1490 1087397 0 AUi29OUP0iGFhozPkXM1
stream terminated:

The search scroll is closing before stream2es is able to return and
get another batch of hits. This typically means that ES is under
pressure on one side or the other.

Try either increasing --scroll-time or decreasing --scroll-size.

01:52.961 1767.1d/s 1260.0K/s 199618 1490 1087627 0 AUi29N3N0iGFhozPkWCc
streamed 200000 indexed 199618 bytes xfer 145746136 errors 0