Open adityamaru opened 1 year ago
I've slapped on the GA-blocker label, but technically this is not a regression from 22.2, when this infrastructure was first introduced, so we can re-evaluate.
I'll keep this open until the 23.1 backport merges.
In our test cluster we observed two ranges corresponding to dropped tables not respecting protection policies enforced by SystemSpanConfigs. The replicas of these empty ranges were still evaluating batch requests and causing them to fail with:

‹ERROR: batch timestamp 1698960957.937065386,0 must be after replica GC threshold 1698967771.741796894,0 (SQLSTATE XXUUU)›

The running theory is that with the following sequence of operations:

We end up in a situation where the `spanconfigstore` does not find any overlapping span configs for the empty range and so does not run the logic to check if any system span configs apply to that range (https://github.com/cockroachdb/cockroach/blob/master/pkg/spanconfig/spanconfigstore/store.go#L199). In this way it misses any protection policies that should hold up the GCThreshold and allows GC to move past the protected timestamp.

We think it makes sense to apply a default zone config with the system span configs combined into it to such ranges. We are still attempting to reproduce this locally.

Jira issue: CRDB-33239
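
To make the suspected failure mode and the proposed fallback concrete, here is a minimal toy model in Go. It is only a sketch and is not the real `spanconfigstore` code: every type and function in it (`Span`, `Config`, `Store`, `lookupBuggy`, `lookupProposed`, `gcThreshold`, `newExampleStore`) is hypothetical. It uses integer seconds for timestamps, and it treats system span config protections as cluster-wide for simplicity.

```go
package main

import "fmt"

// Span is a half-open key interval [Start, End). Hypothetical simplification.
type Span struct{ Start, End string }

// Overlaps reports whether two half-open spans share any keys.
func (s Span) Overlaps(o Span) bool { return s.Start < o.End && o.Start < s.End }

// Config is a toy span config: a GC TTL plus timestamps that must stay readable.
type Config struct {
	GCTTLSeconds        int64
	ProtectedTimestamps []int64
}

type entry struct {
	span Span
	conf Config
}

// Store is a toy stand-in for a span config store.
type Store struct {
	entries           []entry // regular span configs
	systemProtections []int64 // protections from system span configs (assumed cluster-wide here)
	fallback          Config  // default config for spans with no entry
}

// withSystemProtections returns a copy of c with the system protections merged in.
func (s *Store) withSystemProtections(c Config) Config {
	out := c
	out.ProtectedTimestamps = append(append([]int64{}, c.ProtectedTimestamps...), s.systemProtections...)
	return out
}

// lookupBuggy models the suspected behaviour: when no entry overlaps the
// span, the fallback is returned without consulting system span configs.
func (s *Store) lookupBuggy(sp Span) Config {
	for _, e := range s.entries {
		if e.span.Overlaps(sp) {
			return s.withSystemProtections(e.conf)
		}
	}
	return s.fallback
}

// lookupProposed models the suggested fix: the fallback config also has the
// system protections combined into it.
func (s *Store) lookupProposed(sp Span) Config {
	for _, e := range s.entries {
		if e.span.Overlaps(sp) {
			return s.withSystemProtections(e.conf)
		}
	}
	return s.withSystemProtections(s.fallback)
}

// gcThreshold returns now - TTL, capped strictly below every protected timestamp.
func gcThreshold(c Config, now int64) int64 {
	th := now - c.GCTTLSeconds
	for _, p := range c.ProtectedTimestamps {
		if p <= th {
			th = p - 1
		}
	}
	return th
}

// newExampleStore builds a store with one regular entry and one system protection.
func newExampleStore() *Store {
	return &Store{
		entries:           []entry{{Span{"a", "b"}, Config{GCTTLSeconds: 100}}},
		systemProtections: []int64{1000},
		fallback:          Config{GCTTLSeconds: 100},
	}
}

func main() {
	s := newExampleStore()
	empty := Span{"x", "y"} // range of a dropped table, no overlapping entry
	const now = 5000
	fmt.Println("buggy threshold:   ", gcThreshold(s.lookupBuggy(empty), now))
	fmt.Println("proposed threshold:", gcThreshold(s.lookupProposed(empty), now))
}
```

In this toy setup the buggy lookup lets the threshold advance to 4900, past the protected timestamp 1000, so a read at 1000 would be rejected in the same way as the error above. The proposed lookup holds the threshold at 999.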