Alternatively, you could perhaps set `request_context.request_end` the first time you set `request_context.request_start`, since obviously the intent is that `request_end` can be continually updated.
nm. this is correctly addressed here, I believe.
Under "high" rally load (~10Gbit/s using the elastic/logs track with 1000 bulk indexing clients), I can consistently get Rally to crash within some minutes. Looking at the logs, it seems to happen when the target ES momentarily cannot accommodate the load (e.g., returns 429s). The logs prior to the exception will be like:
thereafter, Rally will stop with an exception:
Looking at the stack trace, the exception is generated at the point where `request_end` is unexpectedly `None`.
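Schematically (this is not Rally's actual code, just an illustration of the failure mode), a missing `request_end` turns an otherwise-handled error into an unplanned exception as soon as the timing math runs:

```python
def service_time(request_start, request_end):
    # Schematic stand-in for the downstream timing calculation: if
    # request_end was never set (still None), the subtraction raises
    # "TypeError: unsupported operand type(s) for -: 'NoneType' and 'float'",
    # which aborts the whole benchmark rather than just this one request.
    return request_end - request_start


service_time(request_start=12.5, request_end=None)  # raises TypeError
```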
Looking at the code, I think the intent is that `request_context.request_end` should always be set, even in error scenarios. Perhaps there is some race condition with the HTTP request library, though, where for some reason `on_request_end()` is never called, leaving `request_context.request_end` unset.
This PR doesn't try to address the root cause (why `request_context.request_end` is unset), but rather acknowledges that it is unset (only in already-acknowledged error scenarios, i.e., when `request_meta_data["success"] == False`) and sets it. My assumption is that because `request_meta_data["success"] == False`, and because it can legitimately be `False` at this point (with `request_end` set to something), some other code further down the line will handle counting the error; what we need to focus on here is simply not stopping overall execution (due to an unplanned exception) on an otherwise non-fatal error.

Attachments: rally_log.txt, rally_config.txt
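For reference, a minimal sketch of the shape of this change, assuming a hypothetical helper and a simple fallback timestamp (the actual diff is authoritative; the names here are illustrative):

```python
import time
from types import SimpleNamespace


def backfill_request_end(request_context, request_meta_data):
    """Backfill a missing request_end, but only in an already-acknowledged
    error case (success == False), so downstream timing math does not raise
    an unplanned exception; error accounting still happens elsewhere."""
    if not request_meta_data.get("success", True) and request_context.request_end is None:
        # The fallback value is illustrative: reuse request_start (i.e. zero
        # service time) if available, otherwise the current clock reading.
        request_context.request_end = (
            request_context.request_start
            if request_context.request_start is not None
            else time.perf_counter()
        )


# Usage with a stand-in context object:
ctx = SimpleNamespace(request_start=12.5, request_end=None)
backfill_request_end(ctx, {"success": False})
assert ctx.request_end == 12.5
```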