logzio / elasticsearch-stress-test

Stress test tool for Elasticsearch
Apache License 2.0

Default timeout settings cause ConnectionErrors #11

Closed danielmitterdorfer closed 7 years ago

danielmitterdorfer commented 7 years ago

Problem

When a bulk request takes longer than the default timeout of the Python Elasticsearch client, the script records an error and moves on to the next request. The problem is that the server is likely still processing the request. Thus the test script is actually throwing even more load at an already overloaded node.
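To illustrate the effect without a running cluster, here is a minimal standard-library sketch. It assumes a hypothetical `slow_bulk` function standing in for the server and a thread pool standing in for the network; none of these names come from the stress-test script. It only shows that a client-side timeout does not cancel the work on the other side.

```python
import concurrent.futures
import threading
import time


def slow_bulk(done_event, work_seconds):
    """Hypothetical stand-in for a node processing a bulk request."""
    time.sleep(work_seconds)
    done_event.set()
    return "ok"


def send_with_timeout(pool, work_seconds, client_timeout):
    """Submit the work, but stop waiting after client_timeout seconds."""
    done = threading.Event()
    future = pool.submit(slow_bulk, done, work_seconds)
    try:
        future.result(timeout=client_timeout)
        timed_out = False
    except concurrent.futures.TimeoutError:
        # The client gives up here, but the "server" keeps working.
        timed_out = True
    return timed_out, done, future


with concurrent.futures.ThreadPoolExecutor(max_workers=4) as pool:
    timed_out, done, future = send_with_timeout(
        pool, work_seconds=0.3, client_timeout=0.05
    )
    # Right after the timeout the work has not finished yet.
    still_running_after_timeout = not done.is_set()
    # Waiting without a timeout shows the work completes regardless.
    future.result()
    finished_anyway = done.is_set()
```

If the client immediately sent the next request at the point where it timed out, both requests would be in flight at once, which is the pile-up described above.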

Steps to reproduce

  1. Start an Elasticsearch node - say 5.5.2 - with out-of-the-box settings on localhost
  2. Run python elasticsearch-stress-test.py --es_address localhost --documents 10 --clients 10 --seconds 120 --indices 5 --no-cleanup --not-green

The script will produce failures due to read timeouts. If you insert a print in the try-except you'll see these:

ConnectionTimeout caused by - ReadTimeoutError(HTTPConnectionPool(host=u'localhost', port=9200): Read timed out. (read timeout=10))
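A rough sketch of what "insert a print in the try-except" could look like. The script's actual try-except is not shown in this issue, so `send_bulk`, `bulk_call` and `fake_bulk` are hypothetical names, and the built-in `TimeoutError` stands in for the client's ConnectionTimeout so the example runs without a cluster.

```python
def send_bulk(bulk_call, body):
    """Call bulk_call(body); print any failure instead of silently counting it."""
    try:
        bulk_call(body)
        return True
    except Exception as exc:  # the script's real except clause may differ
        print("bulk request failed: {!r}".format(exc))
        return False


def fake_bulk(body):
    """Hypothetical bulk call that always times out."""
    raise TimeoutError("Read timed out. (read timeout=10)")


ok = send_bulk(fake_bulk, body=[])
```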

Proposed solution

10 seconds is not an insanely long time period if you are hitting a node with default settings with large bulk requests. So I suggest increasing the timeout to e.g. 60 seconds by creating the Elasticsearch client with:

es = Elasticsearch(esaddress, timeout=60)

instead of

es = Elasticsearch(esaddress)
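As a sketch, the change could be wrapped in a small helper so the timeout is a single, adjustable value. `make_client`, `timeout_seconds`, `client_cls` and `RecordingClient` are hypothetical names used only for illustration. The `timeout` keyword is the one used above; as far as I know, newer releases of the Python client prefer `request_timeout`, so check the version you have installed.

```python
def make_client(esaddress, timeout_seconds=60, client_cls=None):
    """Create a client whose requests wait up to timeout_seconds."""
    if client_cls is None:
        # Imported lazily so the sketch can run without the package installed.
        from elasticsearch import Elasticsearch
        client_cls = Elasticsearch
    return client_cls(esaddress, timeout=timeout_seconds)


class RecordingClient:
    """Hypothetical stand-in that only records what it was created with."""

    def __init__(self, hosts, timeout=None):
        self.hosts = hosts
        self.timeout = timeout


# With the real package installed, this would be: es = make_client("localhost")
es = make_client("localhost", client_cls=RecordingClient)
```

A longer timeout does not make the node faster. It only keeps the script from sending the next request while the previous one is still being processed.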

roiravhon commented 7 years ago

Thanks @danielmitterdorfer! Submit a PR? Looks like a completely valid change.

danielmitterdorfer commented 7 years ago

Sure, I can do that, the change is trivial. I just wanted to raise an issue to discuss beforehand.

danielmitterdorfer commented 7 years ago

I've submitted #12. Can you please have a look? :)

roiravhon commented 7 years ago

Thanks @danielmitterdorfer!