Open cbuescher opened 1 year ago
Pinging @elastic/es-distributed (Team:Distributed)
Relabeling this as low-risk, since it is an off-by-one error in a metric count comparison, which is not on a critical path.
I think this is likely a duplicate of https://github.com/elastic/elasticsearch/issues/88841
Improved the test output on failure in #102386 and #102387, now waiting on another failure to confirm.
Unfortunately the AWS debug logging was disabled due to #105020. I raised #109068 to reenable it. I'll ask core-infra whether it is possible to skip logger checking for tests.
It still has not failed since May 28.
Perhaps because it has been muted since then, see 520a1599a65301c0cac44afe1ea306d3f718416f 🤦 I opened https://github.com/elastic/elasticsearch/pull/114129 to start running the test again.
I've been running this test over the past couple of days, with stress-ng switched on and off at random. That is over 20k runs with no failures. IMO, we can close it since it doesn't reproduce.
I also couldn't reproduce it with repeated runs, but it was failing very rarely in CI even before we muted it. I still think it's an issue, though.
Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)
Build scan: https://gradle-enterprise.elastic.co/s/cmzydsjar4s3c/tests/:modules:repository-s3:internalClusterTest/org.elasticsearch.repositories.s3.S3BlobStoreRepositoryTests/testMetrics
Reproduction line:
Applicable branches: main
Reproduces locally?: No
Failure history: https://gradle-enterprise.elastic.co/scans/tests?tests.container=org.elasticsearch.repositories.s3.S3BlobStoreRepositoryTests&tests.test=testMetrics
Failure excerpt: