Open cockroach-teamcity opened 5 days ago
roachtest.c2c/initialscan/kv0 failed with artifacts on master @ eb2d2e19eb29d2747d9e267bd0612a69d066adad:
(soon.go:60).SucceedsWithin: condition failed to evaluate within 30m0s: from cluster_to_cluster.go:1851: no replicated time
(monitor.go:149).Wait: monitor failure: monitor user task failed: t.Fatal() was called
test artifacts and logs in: /artifacts/c2c/initialscan/kv0/run_1
Parameters:
arch=amd64
cloud=aws
coverageBuild=false
cpu=8
encrypted=false
fs=ext4
localSSD=false
metamorphicLeases=default
runtimeAssertionsBuild=false
ssd=0
See: roachtest README
See: How To Investigate (internal)
Grafana is not yet available for aws clusters
roachtest.c2c/initialscan/kv0 failed with artifacts on master @ 5c5c9d6803d47848aa1960dd6642d5f2c1926814:
(soon.go:60).SucceedsWithin: condition failed to evaluate within 30m0s: from cluster_to_cluster.go:1851: no replicated time
(monitor.go:149).Wait: monitor failure: monitor user task failed: t.Fatal() was called
test artifacts and logs in: /artifacts/c2c/initialscan/kv0/cpu_arch=arm64/run_1
Parameters:
arch=arm64
cloud=aws
coverageBuild=false
cpu=8
encrypted=false
fs=ext4
localSSD=false
metamorphicLeases=default
runtimeAssertionsBuild=false
ssd=0
@dt this began regressing after https://github.com/cockroachdb/cockroach/pull/135637 landed. Perhaps we need to round robin spans after all.
> Perhaps we need to round robin spans after all.
This test has so few spans in it that round-robin vs first-k seems irrelevant; the procs all have just two or even only one span each.
I'm guessing the change in timing here is because I now take the dest node count into account when picking the number of processors we'll run: the number of spans is divided by the number of nodes using integer division, so we round down. With thousands of spans in a production-scale case, the remainder from that division is an inconsequential rounding error, but this test has so few spans that the rounding error may be a non-trivial fraction (of a trivial number). Note that we now have 5 procs per node, and some have two spans while most have one. I'm guessing that before the rounding change we got 8 procs per node with exactly one span per proc, so in this extreme edge case of one span per proc, giving some procs a single extra span doubled their work.
I'll poke a bit and see if we want to make the division try harder to get all 8 procs even when the span count is tiny.
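The rounding effect described above can be sketched in a few lines of Go. This is only an illustration of the arithmetic, not the planner's actual code: `procsPerNode`, `spansPerProc`, the cap of 8 procs per node, and the span and node counts used in `main` are all assumptions chosen to reproduce the "5 procs per node, a few with two spans" shape.

```go
package main

import "fmt"

// procsPerNode is a hypothetical helper: it picks the number of processors
// per node by dividing the span count by the node count with integer
// division (rounding down), clamped to the range [1, maxPerNode].
func procsPerNode(spanCount, nodeCount, maxPerNode int) int {
	if nodeCount <= 0 {
		return 0
	}
	n := spanCount / nodeCount // integer division drops the remainder
	if n > maxPerNode {
		n = maxPerNode
	}
	if n < 1 {
		n = 1
	}
	return n
}

// spansPerProc is a hypothetical helper: it spreads spanCount spans over
// procCount processors as evenly as possible and returns the per-processor
// span counts. Round-robin and first-k assignment yield the same counts.
func spansPerProc(spanCount, procCount int) []int {
	if procCount <= 0 {
		return nil
	}
	counts := make([]int, procCount)
	base, extra := spanCount/procCount, spanCount%procCount
	for i := range counts {
		counts[i] = base
		if i < extra {
			counts[i]++ // the leftover spans land on the first few procs
		}
	}
	return counts
}

func main() {
	// Assumed tiny case: 17 spans on 3 nodes, at most 8 procs per node.
	perNode := procsPerNode(17, 3, 8)
	fmt.Println(perNode, spansPerProc(17, 3*perNode))

	// Assumed large case: 10000 spans on 3 nodes.
	perNode = procsPerNode(10000, 3, 8)
	counts := spansPerProc(10000, 3*perNode)
	fmt.Println(perNode, counts[0], counts[len(counts)-1])
}
```

In the tiny case two procs carry two spans while the rest carry one, so the slowest proc does twice the work of the others. In the large case the heaviest and lightest procs differ by a single span out of several hundred.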
roachtest.c2c/initialscan/kv0 failed with artifacts on master @ cea3ff5562160a3bf2802da052da2aaa40e1ccc1:
(soon.go:60).SucceedsWithin: condition failed to evaluate within 30m0s: from cluster_to_cluster.go:1851: no replicated time
(monitor.go:149).Wait: monitor failure: monitor user task failed: t.Fatal() was called
test artifacts and logs in: /artifacts/c2c/initialscan/kv0/cpu_arch=arm64/run_1
Parameters:
arch=arm64
cloud=aws
coverageBuild=false
cpu=8
encrypted=false
fs=ext4
localSSD=false
metamorphicLeases=leader
runtimeAssertionsBuild=false
ssd=0
roachtest.c2c/initialscan/kv0 failed with artifacts on master @ f717f6bd218121bb5e3376af658545f6bff30c22:
(soon.go:60).SucceedsWithin: condition failed to evaluate within 30m0s: from cluster_to_cluster.go:1851: no replicated time
(monitor.go:149).Wait: monitor failure: monitor user task failed: t.Fatal() was called
test artifacts and logs in: /artifacts/c2c/initialscan/kv0/cpu_arch=arm64/run_1
Parameters:
arch=arm64
cloud=aws
coverageBuild=false
cpu=8
encrypted=false
fs=ext4
localSSD=false
metamorphicLeases=leader
runtimeAssertionsBuild=false
ssd=0
roachtest.c2c/initialscan/kv0 failed with artifacts on master @ f717f6bd218121bb5e3376af658545f6bff30c22:
(soon.go:60).SucceedsWithin: condition failed to evaluate within 30m0s: from cluster_to_cluster.go:1851: no replicated time
(monitor.go:149).Wait: monitor failure: monitor user task failed: t.Fatal() was called
test artifacts and logs in: /artifacts/c2c/initialscan/kv0/cpu_arch=arm64/run_1
Parameters:
arch=arm64
cloud=aws
coverageBuild=false
cpu=8
encrypted=false
fs=ext4
localSSD=false
metamorphicLeases=default
runtimeAssertionsBuild=false
ssd=0
- #136091 roachtest: c2c/initialscan/kv0 failed [A-disaster-recovery C-test-failure O-roachtest O-robot T-disaster-recovery branch-release-24.3 release-blocker]
roachtest.c2c/initialscan/kv0 failed with artifacts on master @ 8eeb7f2ae3b2cede564b46ca47e2353fd147c061:
Parameters:
arch=amd64
cloud=aws
coverageBuild=false
cpu=8
encrypted=false
fs=ext4
localSSD=false
metamorphicLeases=leader
runtimeAssertionsBuild=false
ssd=0
/cc @cockroachdb/disaster-recovery
Jira issue: CRDB-44712