cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com

roachtest: follower-reads/mixed-version/survival=region/locality=global/reads=strong failed #129167

Open cockroach-teamcity opened 1 month ago

cockroach-teamcity commented 1 month ago

roachtest.follower-reads/mixed-version/survival=region/locality=global/reads=strong failed with artifacts on master @ 1993fc04b5116f20f4814d637c7ce87b003687e4:

(follower_reads.go:873).verifyHighFollowerReadRatios: too many intervals with more than 2 nodes with low follower read ratios: 23 intervals > 4 threshold. Bad intervals:
interval 09:12:20-09:12:30: n1 ratio: 1.955 n2 ratio: 0.000 n3 ratio: 0.000 n4 ratio: 0.000 n5 ratio: 1.995 n6 ratio: 1.000 
interval 09:12:30-09:12:40: n1 ratio: 1.956 n2 ratio: 0.000 n3 ratio: 0.000 n4 ratio: 0.000 n5 ratio: 1.983 n6 ratio: 1.000 
interval 09:12:40-09:12:50: n1 ratio: 1.961 n2 ratio: 0.000 n3 ratio: 0.000 n4 ratio: 0.000 n5 ratio: 1.998 n6 ratio: 1.000 
interval 09:12:50-09:13:00: n1 ratio: 1.957 n2 ratio: 0.000 n3 ratio: 0.000 n4 ratio: 0.000 n5 ratio: 1.986 n6 ratio: 1.000 
interval 09:13:00-09:13:10: n1 ratio: 1.956 n2 ratio: 0.000 n3 ratio: 0.000 n4 ratio: 0.000 n5 ratio: 1.995 n6 ratio: 1.000 
interval 09:13:10-09:13:20: n1 ratio: 1.961 n2 ratio: 0.000 n3 ratio: 0.000 n4 ratio: 0.000 n5 ratio: 1.988 n6 ratio: 1.000 
interval 09:13:20-09:13:30: n1 ratio: 1.962 n2 ratio: 0.000 n3 ratio: 0.000 n4 ratio: 0.000 n5 ratio: 1.988 n6 ratio: 1.000 
interval 09:13:30-09:13:40: n1 ratio: 1.966 n2 ratio: 0.000 n3 ratio: 0.000 n4 ratio: 0.000 n5 ratio: 1.998 n6 ratio: 1.000 
interval 09:13:40-09:13:50: n1 ratio: 1.970 n2 ratio: 0.000 n3 ratio: 0.000 n4 ratio: 0.000 n5 ratio: 1.993 n6 ratio: 1.000 
interval 09:13:50-09:14:00: n1 ratio: 1.973 n2 ratio: 0.000 n3 ratio: 0.000 n4 ratio: 0.000 n5 ratio: 1.983 n6 ratio: 1.000 
interval 09:14:00-09:14:10: n1 ratio: 1.973 n2 ratio: 0.000 n3 ratio: 0.000 n4 ratio: 0.000 n5 ratio: 1.993 n6 ratio: 1.000 
interval 09:14:10-09:14:20: n1 ratio: 1.968 n2 ratio: 0.000 n3 ratio: 0.000 n4 ratio: 0.000 n5 ratio: 1.993 n6 ratio: 1.000 
interval 09:14:20-09:14:30: n1 ratio: 1.965 n2 ratio: 0.000 n3 ratio: 0.000 n4 ratio: 0.000 n5 ratio: 1.993 n6 ratio: 1.000 
interval 09:14:30-09:14:40: n1 ratio: 1.966 n2 ratio: 0.000 n3 ratio: 0.000 n4 ratio: 0.000 n5 ratio: 1.983 n6 ratio: 1.000 
interval 09:14:40-09:14:50: n1 ratio: 1.974 n2 ratio: 0.000 n3 ratio: 0.000 n4 ratio: 0.000 n5 ratio: 1.990 n6 ratio: 1.000 
interval 09:14:50-09:15:00: n1 ratio: 1.975 n2 ratio: 0.000 n3 ratio: 0.000 n4 ratio: 0.000 n5 ratio: 1.998 n6 ratio: 1.000 
interval 09:15:00-09:15:10: n1 ratio: 1.976 n2 ratio: 0.000 n3 ratio: 0.000 n4 ratio: 0.000 n5 ratio: 1.981 n6 ratio: 1.000 
interval 09:15:10-09:15:20: n1 ratio: 1.974 n2 ratio: 0.000 n3 ratio: 0.000 n4 ratio: 0.000 n5 ratio: 1.998 n6 ratio: 1.000 
interval 09:15:20-09:15:30: n1 ratio: 1.977 n2 ratio: 0.000 n3 ratio: 0.000 n4 ratio: 0.000 n5 ratio: 1.986 n6 ratio: 1.000 
interval 09:15:30-09:15:40: n1 ratio: 1.968 n2 ratio: 0.000 n3 ratio: 0.000 n4 ratio: 0.000 n5 ratio: 2.002 n6 ratio: 1.000 
interval 09:15:40-09:15:50: n1 ratio: 1.968 n2 ratio: 0.000 n3 ratio: 0.000 n4 ratio: 0.000 n5 ratio: 1.990 n6 ratio: 1.000 
interval 09:15:50-09:16:00: n1 ratio: 1.972 n2 ratio: 0.000 n3 ratio: 0.000 n4 ratio: 0.000 n5 ratio: 1.988 n6 ratio: 1.000 
interval 09:16:00-09:16:10: n1 ratio: 1.971 n2 ratio: 0.000 n3 ratio: 0.000 n4 ratio: 0.000 n5 ratio: 1.998 n6 ratio: 1.000 
(mixedversion.go:695).Run: panic (stack trace above): t.Fatal() was called
test artifacts and logs in: /artifacts/follower-reads/mixed-version/survival=region/locality=global/reads=strong/run_1
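
For readers unfamiliar with the message above, here is a minimal sketch of the kind of check it describes, not the actual code in follower_reads.go. The limits of 2 nodes and 4 intervals are taken from the failure message; the 0.9 cutoff for a "low" ratio and all identifiers (`minRatio`, `countLowNodes`, `checkIntervals`) are assumptions made for illustration.

```go
package main

import "fmt"

// The node and interval limits come from the failure message. The 0.9 cutoff
// for a "low" ratio is a hypothetical value chosen for this sketch.
const (
	minRatio        = 0.9 // assumption: ratios below this count as low
	maxLowNodes     = 2   // from the message: "more than 2 nodes"
	maxBadIntervals = 4   // from the message: "> 4 threshold"
)

// countLowNodes returns how many per-node ratios fall below minRatio.
func countLowNodes(ratios []float64) int {
	low := 0
	for _, r := range ratios {
		if r < minRatio {
			low++
		}
	}
	return low
}

// checkIntervals returns an error when more than maxBadIntervals intervals
// each have more than maxLowNodes nodes with a low ratio.
func checkIntervals(intervals [][]float64) error {
	bad := 0
	for _, ratios := range intervals {
		if countLowNodes(ratios) > maxLowNodes {
			bad++
		}
	}
	if bad > maxBadIntervals {
		return fmt.Errorf(
			"too many intervals with more than %d nodes with low follower read ratios: %d intervals > %d threshold",
			maxLowNodes, bad, maxBadIntervals)
	}
	return nil
}

func main() {
	// Ratios from the first reported interval (09:12:20-09:12:30), repeated
	// 23 times to mirror the shape of the report.
	sample := []float64{1.955, 0.000, 0.000, 0.000, 1.995, 1.000}
	intervals := make([][]float64, 23)
	for i := range intervals {
		intervals[i] = sample
	}
	fmt.Println(checkIntervals(intervals))
}
```

Under the assumed cutoff, n2, n3, and n4 count as low in every interval of this run, which puts each interval over the 2-node limit.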

Parameters:

See: roachtest README

See: How To Investigate (internal)

See: Grafana

Same failure on other branches

- #129031 roachtest: follower-reads/mixed-version/survival=region/locality=global/reads=strong failed [C-test-failure O-roachtest O-robot T-kv branch-release-24.2 release-blocker]
- #128899 roachtest: follower-reads/mixed-version/survival=region/locality=global/reads=strong failed [C-test-failure O-roachtest O-robot T-kv branch-release-23.2 release-blocker]

/cc @cockroachdb/kv-triage

This test on roachdash | Improve this report!

Jira issue: CRDB-41411

cockroach-teamcity commented 1 month ago

roachtest.follower-reads/mixed-version/survival=region/locality=global/reads=strong failed with artifacts on master @ 1993fc04b5116f20f4814d637c7ce87b003687e4:

(follower_reads.go:873).verifyHighFollowerReadRatios: too many intervals with more than 2 nodes with low follower read ratios: 23 intervals > 4 threshold. Bad intervals:
interval 10:41:50-10:42:00: n1 ratio: 0.000 n2 ratio: 3.593 n3 ratio: 1.000 n4 ratio: 0.000 n5 ratio: 1.000 n6 ratio: 0.000 
interval 10:42:00-10:42:10: n1 ratio: 0.000 n2 ratio: 3.571 n3 ratio: 1.000 n4 ratio: 0.000 n5 ratio: 1.000 n6 ratio: 0.001 
interval 10:42:10-10:42:20: n1 ratio: 0.000 n2 ratio: 3.587 n3 ratio: 1.000 n4 ratio: 0.000 n5 ratio: 1.001 n6 ratio: 0.000 
interval 10:42:20-10:42:30: n1 ratio: 0.000 n2 ratio: 3.582 n3 ratio: 1.000 n4 ratio: 0.000 n5 ratio: 1.002 n6 ratio: 0.000 
interval 10:42:30-10:42:40: n1 ratio: 0.000 n2 ratio: 3.597 n3 ratio: 1.000 n4 ratio: 0.000 n5 ratio: 1.001 n6 ratio: 0.001 
interval 10:42:40-10:42:50: n1 ratio: 0.000 n2 ratio: 3.572 n3 ratio: 1.000 n4 ratio: 0.000 n5 ratio: 1.000 n6 ratio: 0.000 
interval 10:42:50-10:43:00: n1 ratio: 0.000 n2 ratio: 3.593 n3 ratio: 1.000 n4 ratio: 0.000 n5 ratio: 1.000 n6 ratio: 0.000 
interval 10:43:00-10:43:10: n1 ratio: 0.000 n2 ratio: 3.578 n3 ratio: 1.000 n4 ratio: 0.000 n5 ratio: 1.000 n6 ratio: 0.000 
interval 10:43:10-10:43:20: n1 ratio: 0.000 n2 ratio: 3.597 n3 ratio: 1.000 n4 ratio: 0.000 n5 ratio: 1.000 n6 ratio: 0.000 
interval 10:43:20-10:43:30: n1 ratio: 0.000 n2 ratio: 3.586 n3 ratio: 1.000 n4 ratio: 0.000 n5 ratio: 1.000 n6 ratio: 0.000 
interval 10:43:30-10:43:40: n1 ratio: 0.000 n2 ratio: 3.565 n3 ratio: 1.000 n4 ratio: 0.000 n5 ratio: 1.000 n6 ratio: 0.000 
interval 10:43:40-10:43:50: n1 ratio: 0.000 n2 ratio: 3.578 n3 ratio: 1.000 n4 ratio: 0.000 n5 ratio: 1.000 n6 ratio: 0.000 
interval 10:43:50-10:44:00: n1 ratio: 0.000 n2 ratio: 3.603 n3 ratio: 1.000 n4 ratio: 0.000 n5 ratio: 1.000 n6 ratio: 0.000 
interval 10:44:00-10:44:10: n1 ratio: 0.000 n2 ratio: 3.576 n3 ratio: 1.000 n4 ratio: 0.000 n5 ratio: 1.000 n6 ratio: 0.001 
interval 10:44:10-10:44:20: n1 ratio: 0.000 n2 ratio: 3.578 n3 ratio: 1.000 n4 ratio: 0.000 n5 ratio: 1.000 n6 ratio: 0.000 
interval 10:44:20-10:44:30: n1 ratio: 0.000 n2 ratio: 3.576 n3 ratio: 1.000 n4 ratio: 0.000 n5 ratio: 1.000 n6 ratio: 0.000 
interval 10:44:30-10:44:40: n1 ratio: 0.000 n2 ratio: 3.603 n3 ratio: 1.000 n4 ratio: 0.000 n5 ratio: 1.000 n6 ratio: 0.000 
interval 10:44:40-10:44:50: n1 ratio: 0.000 n2 ratio: 3.571 n3 ratio: 1.000 n4 ratio: 0.000 n5 ratio: 1.000 n6 ratio: 0.000 
interval 10:44:50-10:45:00: n1 ratio: 0.000 n2 ratio: 3.593 n3 ratio: 1.000 n4 ratio: 0.000 n5 ratio: 1.001 n6 ratio: 0.000 
interval 10:45:00-10:45:10: n1 ratio: 0.000 n2 ratio: 3.575 n3 ratio: 1.000 n4 ratio: 0.000 n5 ratio: 0.999 n6 ratio: 0.000 
interval 10:45:10-10:45:20: n1 ratio: 0.000 n2 ratio: 3.577 n3 ratio: 1.000 n4 ratio: 0.000 n5 ratio: 1.000 n6 ratio: 0.000 
interval 10:45:20-10:45:30: n1 ratio: 0.000 n2 ratio: 3.586 n3 ratio: 1.000 n4 ratio: 0.000 n5 ratio: 1.000 n6 ratio: 0.000 
interval 10:45:30-10:45:40: n1 ratio: 0.000 n2 ratio: 3.565 n3 ratio: 1.000 n4 ratio: 0.000 n5 ratio: 1.000 n6 ratio: 0.000 
(mixedversion.go:695).Run: panic (stack trace above): t.Fatal() was called
test artifacts and logs in: /artifacts/follower-reads/mixed-version/survival=region/locality=global/reads=strong/run_1

Same failure on other branches

- #129192 roachtest: follower-reads/mixed-version/survival=region/locality=global/reads=strong failed [C-test-failure O-roachtest O-robot T-kv branch-release-24.1 release-blocker]
- #129031 roachtest: follower-reads/mixed-version/survival=region/locality=global/reads=strong failed [C-test-failure O-roachtest O-robot T-kv branch-release-24.2 release-blocker]
- #128899 roachtest: follower-reads/mixed-version/survival=region/locality=global/reads=strong failed [C-test-failure O-roachtest O-robot T-kv branch-release-23.2 release-blocker]

kvoli commented 1 month ago

Duplicate of https://github.com/cockroachdb/cockroach/issues/129031

renatolabs commented 3 weeks ago

Keeping this as the main issue for the follower-reads failure, since it's the branch-master issue and master is the only branch where this failure could happen again (other than frozen rc branches, but we can close those manually while they exist).

For the record, there's some useful investigation on #129031.

cockroach-teamcity commented 3 days ago

roachtest.follower-reads/mixed-version/survival=region/locality=global/reads=strong failed with artifacts on master @ 3f8226eb7f09eb93c26dda4cd47e023a8f56ea23:

(follower_reads.go:873).verifyHighFollowerReadRatios: too many intervals with more than 2 nodes with low follower read ratios: 23 intervals > 4 threshold. Bad intervals:
interval 11:41:10-11:41:20: n1 ratio: 0.001 n2 ratio: 0.005 n3 ratio: 1.988 n4 ratio: 0.997 n5 ratio: 0.123 n6 ratio: 0.979 
interval 11:41:20-11:41:30: n1 ratio: 0.001 n2 ratio: 0.005 n3 ratio: 1.951 n4 ratio: 1.000 n5 ratio: 1.793 n6 ratio: 0.000 
interval 11:41:30-11:41:40: n1 ratio: 0.001 n2 ratio: 0.002 n3 ratio: 1.984 n4 ratio: 1.000 n5 ratio: 1.790 n6 ratio: 0.000 
interval 11:41:40-11:41:50: n1 ratio: 0.001 n2 ratio: 0.005 n3 ratio: 1.980 n4 ratio: 1.000 n5 ratio: 1.795 n6 ratio: 0.000 
interval 11:41:50-11:42:00: n1 ratio: 0.001 n2 ratio: 0.000 n3 ratio: 1.982 n4 ratio: 1.000 n5 ratio: 1.793 n6 ratio: 0.000 
interval 11:42:00-11:42:10: n1 ratio: 0.000 n2 ratio: 0.000 n3 ratio: 1.993 n4 ratio: 1.003 n5 ratio: 1.785 n6 ratio: 0.000 
interval 11:42:10-11:42:20: n1 ratio: 0.001 n2 ratio: 0.000 n3 ratio: 1.984 n4 ratio: 1.003 n5 ratio: 1.790 n6 ratio: 0.000 
interval 11:42:20-11:42:30: n1 ratio: 0.001 n2 ratio: 0.000 n3 ratio: 1.996 n4 ratio: 1.000 n5 ratio: 1.213 n6 ratio: 0.000 
interval 11:42:30-11:42:40: n1 ratio: 0.000 n2 ratio: 0.000 n3 ratio: 1.987 n4 ratio: 1.003 n5 ratio: 1.177 n6 ratio: 0.000 
interval 11:42:40-11:42:50: n1 ratio: 0.001 n2 ratio: 0.000 n3 ratio: 1.991 n4 ratio: 1.003 n5 ratio: 1.779 n6 ratio: 0.000 
interval 11:42:50-11:43:00: n1 ratio: 0.000 n2 ratio: 0.000 n3 ratio: 1.989 n4 ratio: 1.003 n5 ratio: 1.778 n6 ratio: 0.000 
interval 11:43:00-11:43:10: n1 ratio: 0.000 n2 ratio: 0.005 n3 ratio: 1.984 n4 ratio: 1.003 n5 ratio: 1.786 n6 ratio: 0.000 
interval 11:43:10-11:43:20: n1 ratio: 0.000 n2 ratio: 0.000 n3 ratio: 1.989 n4 ratio: 0.997 n5 ratio: 1.782 n6 ratio: 0.000 
interval 11:43:20-11:43:30: n1 ratio: 0.000 n2 ratio: 0.000 n3 ratio: 1.987 n4 ratio: 1.000 n5 ratio: 1.785 n6 ratio: 0.000 
interval 11:43:30-11:43:40: n1 ratio: 0.000 n2 ratio: 0.000 n3 ratio: 1.989 n4 ratio: 1.000 n5 ratio: 1.784 n6 ratio: 0.000 
interval 11:43:40-11:43:50: n1 ratio: 0.000 n2 ratio: 0.000 n3 ratio: 1.993 n4 ratio: 1.000 n5 ratio: 1.785 n6 ratio: 0.000 
interval 11:43:50-11:44:00: n1 ratio: 0.000 n2 ratio: 0.000 n3 ratio: 1.984 n4 ratio: 1.000 n5 ratio: 1.783 n6 ratio: 0.000 
interval 11:44:00-11:44:10: n1 ratio: 0.000 n2 ratio: 0.000 n3 ratio: 1.991 n4 ratio: 1.000 n5 ratio: 1.780 n6 ratio: 0.000 
interval 11:44:10-11:44:20: n1 ratio: 0.000 n2 ratio: 0.000 n3 ratio: 1.989 n4 ratio: 1.000 n5 ratio: 1.779 n6 ratio: 0.000 
interval 11:44:20-11:44:30: n1 ratio: 0.000 n2 ratio: 0.000 n3 ratio: 1.984 n4 ratio: 1.000 n5 ratio: 1.614 n6 ratio: 0.000 
interval 11:44:30-11:44:40: n1 ratio: 0.000 n2 ratio: 0.000 n3 ratio: 1.991 n4 ratio: 1.000 n5 ratio: 1.568 n6 ratio: 0.000 
interval 11:44:40-11:44:50: n1 ratio: 0.000 n2 ratio: 0.000 n3 ratio: 1.989 n4 ratio: 1.000 n5 ratio: 1.786 n6 ratio: 0.000 
interval 11:44:50-11:45:00: n1 ratio: 0.000 n2 ratio: 0.000 n3 ratio: 1.989 n4 ratio: 1.000 n5 ratio: 1.796 n6 ratio: 0.000 
(mixedversion.go:710).Run: panic (stack trace above): t.Fatal() was called
test artifacts and logs in: /artifacts/follower-reads/mixed-version/survival=region/locality=global/reads=strong/run_1
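
When comparing runs, it can help to pull the per-node ratios out of the "interval ..." lines programmatically. The following is a rough sketch that assumes the line format shown in these reports; `nodeRatioRE` and `parseRatios` are hypothetical helpers, not part of roachtest.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// nodeRatioRE matches one "nN ratio: X.XXX" pair as printed in the report.
var nodeRatioRE = regexp.MustCompile(`n(\d+) ratio: (\d+\.\d+)`)

// parseRatios extracts a node ID -> ratio map from one "interval ..." line.
func parseRatios(line string) (map[int]float64, error) {
	out := make(map[int]float64)
	for _, m := range nodeRatioRE.FindAllStringSubmatch(line, -1) {
		id, err := strconv.Atoi(m[1])
		if err != nil {
			return nil, err
		}
		ratio, err := strconv.ParseFloat(m[2], 64)
		if err != nil {
			return nil, err
		}
		out[id] = ratio
	}
	if len(out) == 0 {
		return nil, fmt.Errorf("no node ratios found in %q", line)
	}
	return out, nil
}

func main() {
	// First bad interval from the most recent failure in this issue.
	line := "interval 11:41:10-11:41:20: n1 ratio: 0.001 n2 ratio: 0.005 " +
		"n3 ratio: 1.988 n4 ratio: 0.997 n5 ratio: 0.123 n6 ratio: 0.979"
	ratios, err := parseRatios(line)
	if err != nil {
		panic(err)
	}
	for id := 1; id <= 6; id++ {
		fmt.Printf("n%d: %.3f\n", id, ratios[id])
	}
}
```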
