cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com

sql: TestRelocateNonVoters failed [expected "type of replica being removed (VOTER_FULL) does not match: {ChangeType:REMOVE_NON_VOTER Target:n4,s4}] #132078

Open cockroach-teamcity opened 2 days ago

cockroach-teamcity commented 2 days ago

sql.TestRelocateNonVoters failed on release-24.2 @ 08468de10d89483d52fecfcf3fb3892564e8df3a:

=== RUN   TestRelocateNonVoters
    test_log_scope.go:165: test logs captured to: outputs.zip/logTestRelocateNonVoters3350201624
    test_log_scope.go:76: use -show-logs to present logs inline
    multitenant_admin_function_test.go:956: -- test log scope end --
test logs left over in: outputs.zip/logTestRelocateNonVoters3350201624
--- FAIL: TestRelocateNonVoters (301.51s)
=== RUN   TestRelocateNonVoters/ALTER_RANGE_RELOCATE_NONVOTERS
    multitenant_admin_function_test.go:374: condition failed to evaluate within 3m45s: from multitenant_admin_function_test.go:411: expected "type of replica being removed (VOTER_FULL) does not match expectation for change: {ChangeType:REMOVE_NON_VOTER Target:n4,s4}" contains "ok" tenant=system query=``ALTER RANGE RELOCATE NONVOTERS FROM 4 TO 5 FOR (SELECT min(range_id) FROM [SHOW RANGES FROM TABLE t]);`` leaseholder=4 replicas=[1 2 3 4] voting_replicas=[1 2 3] non_voting_replicas=[4] fromReplica=4 toReplica=5 row=0 col=2
    --- FAIL: TestRelocateNonVoters/ALTER_RANGE_RELOCATE_NONVOTERS (264.08s)

See also: How To Investigate a Go Test Failure (internal)

/cc @cockroachdb/sql-foundations

Jira issue: CRDB-42824

annrpom commented 18 hours ago

multitenant_admin_function_test.go:411: expected "type of replica being removed (VOTER_FULL) does not match expectation for change: {ChangeType:REMOVE_NON_VOTER Target:n4,s4}" contains "ok" tenant=system query=`ALTER RANGE RELOCATE NONVOTERS FROM 4 TO 5 FOR (SELECT min(range_id) FROM [SHOW RANGES FROM TABLE t]);` leaseholder=4 replicas=[1 2 3 4] voting_replicas=[1 2 3] non_voting_replicas=[4] fromReplica=4 toReplica=5 row=0 col=2

rafiss commented 15 hours ago

It looks like the test expected to remove a non-voter replica, but the replica on n4,s4 was a voter (VOTER_FULL) when the change was evaluated.

This has also failed in https://github.com/cockroachdb/cockroach/issues/129883 and https://github.com/cockroachdb/cockroach/issues/126541, so I'm curious whether the KV team has any advice on how to stabilize this further. Perhaps there is a testing knob that could help?

andrewbaptist commented 14 hours ago

I can change the allocator interval so that it scans much faster, which lets it recover from this issue when the two sources of rebalancing race with each other.