Closed by cachedout 3 years ago
(Assigned to @axw just to get the ball rolling but feel free to re-assign as appropriate.)
I've been kicking this issue-can down the road as we've had more pressing matters. The benchmark tests now seem to be stable: I don't see any failures in the past 20 days.
We're revisiting benchmarking in the 8.0 timeframe, with a view to replacing the current hey-apm benchmarks: https://github.com/elastic/apm-server/issues/6549.
Given that, I'm reluctant to spend time on digging into this unless it's causing toil for the productivity team. @cachedout please let us know if that's the case. If not, I'll close this and we'll focus efforts on coming up with something more reliable.
> Given that, I'm reluctant to spend time on digging into this unless it's causing toil for the productivity team.
I agree that it's OK to close this. Thanks, @axw
The benchmark tests have begun to fail.
In looking at these cases, it appears that the hey-apm container is exiting with error code 1:
https://apm-ci.elastic.co/blue/organizations/jenkins/apm-server%2Fapm-hey-test-benchmark/detail/apm-hey-test-benchmark/832/pipeline#step-163-log-560
For the above failure, the output from the containers can be found here: https://apm-ci.elastic.co/job/apm-server/job/apm-hey-test-benchmark/832/artifact/src/github.com/elastic/hey-apm/build/environment.txt
These tests are running somewhat close to the 1hr timeout set for the pipeline, but I don't think that's what is triggering the container exit.
Additionally, the logs show that a large number of transactions are getting dropped. I don't know whether that's expected, but I wanted to mention it here as well.