cockroachdb / cockroach

CockroachDB - the open source, cloud-native distributed SQL database.
https://www.cockroachlabs.com

roachtest: c2c/tpcc/warehouses=500/duration=10/cutover=0 failed [azure failure] #125752

Open cockroach-teamcity opened 1 month ago

cockroach-teamcity commented 1 month ago

roachtest.c2c/tpcc/warehouses=500/duration=10/cutover=0 failed with artifacts on master @ 3aa5bdf40527e2f60b179094403b4302a1c2cbe1:

(soon.go:64).SucceedsWithin: condition failed to evaluate within 30m0s: from cluster_to_cluster.go:690: no replicated time
(monitor.go:154).Wait: monitor failure: monitor user task failed: t.Fatal() was called
test artifacts and logs in: /artifacts/c2c/tpcc/warehouses=500/duration=10/cutover=0/run_1

Parameters:

See: roachtest README

See: How To Investigate (internal)

Grafana is not yet available for azure clusters

/cc @cockroachdb/disaster-recovery

Jira issue: CRDB-39592

msbutler commented 1 month ago

i'm going to disable these tests on azure.

msbutler commented 1 month ago

this is actually the replanning regression documented here.

❯ grep "stream_ingestion_job" *unredacted/cockroach.log | grep "hit retryable error" | grep "node frontier" | wc -l
     108

But even before the regression was introduced, test history indicates that azure roachtests were running much slower than on other clouds, so i'm going to keep pcr tests skipped on azure for now.