cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com
Other
30.16k stars 3.82k forks source link

roachtest: follower-reads/survival=zone/locality=global/reads=exact-staleness failed #135068

Open cockroach-teamcity opened 1 week ago

cockroach-teamcity commented 1 week ago

Note: This build has runtime assertions enabled. If the same failure was hit in a run without assertions enabled, there should be a similar failure without this message. If there isn't one, then this failure is likely due to an assertion violation or (assertion) timeout.

roachtest.follower-reads/survival=zone/locality=global/reads=exact-staleness failed with artifacts on release-24.3 @ 7ad4a90721e315e728bf7a9b3e5ae4f78d53ebca:

(follower_reads.go:742).verifySQLLatency: Post "https://34.73.4.36:26258/ts/query": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
test artifacts and logs in: /artifacts/follower-reads/survival=zone/locality=global/reads=exact-staleness/run_1

Parameters:

See: roachtest README

See: How To Investigate (internal)

See: Grafana

/cc @cockroachdb/kv-triage

This test on roachdash | Improve this report!

Jira issue: CRDB-44350

arulajmani commented 1 week ago

Seems like another instance of https://github.com/cockroachdb/cockroach/issues/134047.

blathers-crl[bot] commented 1 week ago

cc @cockroachdb/test-eng

DarrylWong commented 1 week ago

I can bump the default http timeout to say 10 seconds, but I just want to point out that this test is using a 3 second timeout (as opposed to the 500 ms #134047 used) and does not experience the same cluster overload symptoms. CPU is under 10% and latency is fairly low. I could be missing some context on what the test is doing though.