miku / esbulk

Bulk indexing command line tool for elasticsearch.
GNU General Public License v3.0

indexing failed with 504 Gateway Timeout #22

Status: Closed (wmalik closed 6 years ago)

wmalik commented 6 years ago

esbulk fails with the following error while indexing a file with around 850k JSON documents:

08:19:47.337 2018/07/06 07:01:38 indexing failed with 504 Gateway Timeout:

In this case, it could make sense to retry the HTTP request with a backoff until elasticsearch returns a 200 response. I noticed that currently the HTTP requests are not retried: https://github.com/miku/esbulk/blob/master/indexing.go#L174

Would it make sense to retry the HTTP requests N times with an exponential back-off?
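A minimal sketch of what such a retry loop could look like, using only the Go standard library. This is not esbulk's code: `backoffDelay`, `retryable`, and `doWithRetry` are hypothetical names, and the choice to retry on 429 and any 5xx status is an assumption. The request itself is passed in as a function so that each attempt can rebuild its request body.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// backoffDelay returns the wait before the next attempt (attempt is 0-based),
// doubling each time: base, 2*base, 4*base, ...
func backoffDelay(attempt int, base time.Duration) time.Duration {
	return base << uint(attempt)
}

// retryable reports whether a status code is worth retrying.
// Assumption: 429 and all 5xx responses (including 504) are transient.
func retryable(status int) bool {
	return status == http.StatusTooManyRequests || status >= 500
}

// doWithRetry calls send until it succeeds with a non-retryable status,
// making at most maxRetries+1 attempts and sleeping between attempts.
func doWithRetry(send func() (int, error), maxRetries int, base time.Duration) (int, error) {
	var status int
	var err error
	for attempt := 0; ; attempt++ {
		status, err = send()
		if err == nil && !retryable(status) {
			return status, nil
		}
		if attempt == maxRetries {
			break
		}
		time.Sleep(backoffDelay(attempt, base))
	}
	if err != nil {
		return status, fmt.Errorf("giving up after %d attempts: %w", maxRetries+1, err)
	}
	return status, fmt.Errorf("giving up after %d attempts: last status %d", maxRetries+1, status)
}

func main() {
	// Simulated server: two 504 responses, then a 200.
	responses := []int{504, 504, 200}
	calls := 0
	send := func() (int, error) {
		s := responses[calls]
		calls++
		return s, nil
	}
	status, err := doWithRetry(send, 3, 10*time.Millisecond)
	fmt.Println(status, err, calls) // prints: 200 <nil> 3
}
```

In a real client, `send` would issue the bulk request and return `resp.StatusCode`. A cap on the maximum delay and some random jitter are common additions that this sketch leaves out.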

miku commented 6 years ago

> Would it make sense to retry the HTTP requests N times with an exponential back-off?

Yes, sure, sounds good. I've used pester as a drop-in replacement for http.Client in the past and it worked fine. I will try to integrate this into esbulk as well.

wmalik commented 6 years ago

@miku thank you very much for looking into it, and thanks for esbulk in general :)

Please do let me know if I can help with the implementation/testing etc.

miku commented 6 years ago

@wmalik I switched the HTTP client to the default Client from pester, which offers 3 retries with a linear backoff. Not optimal, but a first step.
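To make the difference between a linear and an exponential schedule concrete, here is a small standard-library sketch that sums the waiting time under each. The function names and the one-second step are assumptions chosen for illustration; the code does not call pester and makes no claim about its exact delays.

```go
package main

import (
	"fmt"
	"time"
)

// linearDelay grows by one step per retry (retry is 1-based): step, 2*step, 3*step, ...
func linearDelay(retry int, step time.Duration) time.Duration {
	return time.Duration(retry) * step
}

// exponentialDelay doubles per retry (retry is 1-based): step, 2*step, 4*step, ...
func exponentialDelay(retry int, step time.Duration) time.Duration {
	return step << uint(retry-1)
}

// totalWait sums the delays of retries 1..n for the given schedule.
func totalWait(n int, delay func(int, time.Duration) time.Duration, step time.Duration) time.Duration {
	var total time.Duration
	for r := 1; r <= n; r++ {
		total += delay(r, step)
	}
	return total
}

func main() {
	for _, n := range []int{3, 8} {
		fmt.Printf("retries=%d linear=%v exponential=%v\n",
			n,
			totalWait(n, linearDelay, time.Second),
			totalWait(n, exponentialDelay, time.Second))
	}
}
```

With only 3 retries the two schedules wait almost equally long in total (6s versus 7s under these assumptions). The gap only becomes large once more retries are allowed, which is when an exponential schedule gives an overloaded cluster noticeably more time to recover.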

This change is included in 0.5.1.

miku commented 6 years ago

Closing this, since I assume this is less of a problem with the new pester client.