Closed cockroach-teamcity closed 1 year ago
$ cockroach debug merge-logs logs/*.unredacted | grep 'initiating a split' | grep -v 'span config'
teamcity-10025419-1683782395-02-n4cpu4-0001> I230511 05:37:35.070249 199814 kv/kvserver/replica_command.go:412 ⋮ [T1,n1,split,s1,r64/1:‹/{Table/106-Max}›] 208 initiating a split of this range at key /Table/106/1/‹522779273917878259› [r65] (‹load at key /Table/106/1/522779273917878259 (cpu 1.2s, 1282.11 batches/sec, 51.08 raft mutations/sec)›)‹›
teamcity-10025419-1683782395-02-n4cpu4-0001> I230511 05:37:46.125156 299227 kv/kvserver/replica_command.go:412 ⋮ [T1,n1,split,s1,r65/1:‹/{Table/106/1/…-Max}›] 217 initiating a split of this range at key /Table/106/2 [r66] (‹load at key /Table/106/2 (cpu 750ms, 872.89 batches/sec, 27.15 raft mutations/sec)›)‹›
teamcity-10025419-1683782395-02-n4cpu4-0001> I230511 05:37:46.135101 299050 kv/kvserver/replica_command.go:412 ⋮ [T1,n1,split,s1,r64/1:‹/Table/106{-/1/5227…}›] 218 initiating a split of this range at key /Table/106/1/‹-1908700725658152825› [r67] (‹load at key /Table/106/1/-1908700725658152825 (cpu 894ms, 959.78 batches/sec, 28.96 raft mutations/sec)›)‹›
teamcity-10025419-1683782395-02-n4cpu4-0003> I230511 05:38:08.866451 229216 kv/kvserver/replica_command.go:412 ⋮ [T1,n3,split,s3,r64/4:‹/Table/106{-/1/-190…}›] 139 initiating a split of this range at key /Table/106/1/‹-6202310547831847912› [r84] (‹load at key /Table/106/1/-6202310547831847912 (cpu 1.2s, 1610.52 batches/sec, 28.25 raft mutations/sec)›)‹›
teamcity-10025419-1683782395-02-n4cpu4-0002> I230511 05:38:20.871923 265695 kv/kvserver/replica_command.go:412 ⋮ [T1,n2,split,s2,r67/2:‹/Table/106/1/{-19087…-522779…}›] 178 initiating a split of this range at key /Table/106/1/‹-133697430637084912› [r94] (‹load at key /Table/106/1/-133697430637084912 (cpu 487ms, 1355.69 batches/sec, 7.09 raft mutations/sec)›)‹›
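For anyone who wants to tally these without eyeballing the grep output, here is a rough sketch that applies the same filter in Python. It assumes the bracketed `[rNN]` after the split key is the ID of the newly created range; the function name `count_load_splits` is made up for illustration and is not part of any CockroachDB tooling.

```python
import re

# Captures the range ID in "... initiating a split of this range at key <key> [rNN] (...)".
# Assumption: this bracketed ID identifies the new range produced by the split.
NEW_RANGE_RE = re.compile(r"initiating a split of this range at key .*? \[r(\d+)\]")


def count_load_splits(lines):
    """Return the new range IDs from split log lines, skipping span-config splits.

    Mirrors: grep 'initiating a split' | grep -v 'span config'
    """
    new_ranges = []
    for line in lines:
        if "initiating a split" not in line or "span config" in line:
            continue
        match = NEW_RANGE_RE.search(line)
        if match:
            new_ranges.append(int(match.group(1)))
    return new_ranges
```

Feeding it the output of `cockroach debug merge-logs` line by line gives a list whose length is the number of non-span-config splits.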
@kvoli could you take a look at what's expected here? My intuition is that there's enough randomness in this test to occasionally see a couple more splits than hard-coded in the test.
I got the split values experimentally over a hundred runs of this test. I bumped the bounds slightly from the max/min of those samples.
You're right, there probably is enough randomness that more splits could occur: the workload runner selecting start keys, the weighted split finder's reservoir sampling, and CPU usage for the same span request differing between runs.
The goal of the test is to guard against regressions where there is an outlandish (amplifying) number of splits, or no splits at all.
In this case there were 5 splits, which seems fine and also correct given the error message being logged:
has 6 ranges, expected between 2 and 5 splits
I'll open a PR to bump the expected range higher and fix the error message.
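To illustrate the mismatch between ranges and splits in that message, here is a minimal sketch of a bounds check that converts a range count into a split count before comparing. It is one reading of the problem, not the actual roachtest code: the name `check_split_count` and the assumption that the table starts as a single range (so splits = ranges - 1) are mine.

```python
def check_split_count(range_count, min_splits, max_splits):
    """Return None if the split count is within bounds, else an error string.

    Assumption: the table starts as one range and each split adds exactly
    one range, so splits = range_count - 1.
    """
    splits = range_count - 1
    if min_splits <= splits <= max_splits:
        return None
    return (
        f"observed {splits} splits ({range_count} ranges), "
        f"expected between {min_splits} and {max_splits} splits"
    )
```

Under that assumption, `check_split_count(6, 2, 5)` returns `None`, since 6 ranges correspond to 5 splits, which is inside the stated bounds.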
roachtest.splits/load/spanning/nodes=4/obj=cpu failed with artifacts on master @ 992b8aa4eea4898c8b0ee83a1da289bc1933b91a:
Parameters: ROACHTEST_cloud=gce, ROACHTEST_cpu=4, ROACHTEST_encrypted=false, ROACHTEST_ssd=0
Help
See: [roachtest README](https://github.com/cockroachdb/cockroach/blob/master/pkg/cmd/roachtest/README.md) See: [How To Investigate \(internal\)](https://cockroachlabs.atlassian.net/l/c/SSSBr8c7)
/cc @cockroachdb/kv-triage
Jira issue: CRDB-27827