rangefeed: add data-driven test harness

erikgrinaker commented 2 years ago

We should have a data-driven test harness for rangefeeds. This would set up a testserver/cluster, use datadriven to run commands against it, and run a rangefeed.Factory-based client to record rangefeed responses. This would allow much more exhaustive testing of e.g. catchup scans and event handling. However, care must be taken to make this deterministic -- for example, we can't have timestamps anywhere, and we need to wait for each event after running a command to avoid ordering issues (i.e. take the current timestamp and wait for the checkpoint to progress past it). It should support error handling, and use multiple ranges/nodes.

Functionality needed to be on par with existing non-datadriven tests:

[x] put keys
[x] create rangefeeds
[x] split range
[ ] verify version sequence
[x] verify correctness of prev values
[x] verify checkpoint invariant (cp is below all future events)
[ ] verify frontier invariant (f is below all future events)
[x] ensure checkpoint post timestamp
[x] put key registering its write timestamp
[x] wait for first checkpoint
[x] put sstable
[x] dump sstable
[x] put range tombstone
[ ] create rangefeed with initial scan
[x] issue clear range
[x] capture unrecoverable errors
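As a rough illustration of the checkpoint invariant item above (cp is below all future events), the check could look something like the sketch below. The `feedEvent` type and `verifyCheckpointInvariant` are hypothetical names used only for this example and are not taken from the existing tests; timestamps are simplified to integers.

```go
package main

import "fmt"

// feedEvent is a hypothetical, simplified recorded rangefeed event.
type feedEvent struct {
	checkpoint bool
	key        string
	ts         int64 // logical timestamp, simplified to an integer
}

// verifyCheckpointInvariant walks the events in the order they were received
// and reports the first value event whose timestamp is at or below a
// checkpoint that was emitted earlier.
func verifyCheckpointInvariant(events []feedEvent) error {
	var maxCheckpoint int64
	haveCheckpoint := false
	for i, ev := range events {
		if ev.checkpoint {
			if !haveCheckpoint || ev.ts > maxCheckpoint {
				maxCheckpoint = ev.ts
				haveCheckpoint = true
			}
			continue
		}
		if haveCheckpoint && ev.ts <= maxCheckpoint {
			return fmt.Errorf("event %d (key %q, ts %d) is not above checkpoint %d",
				i, ev.key, ev.ts, maxCheckpoint)
		}
	}
	return nil
}

func main() {
	ok := []feedEvent{
		{key: "a", ts: 3},
		{checkpoint: true, ts: 5},
		{key: "b", ts: 7},
	}
	bad := []feedEvent{
		{checkpoint: true, ts: 5},
		{key: "c", ts: 4}, // arrives after a checkpoint that already covers it
	}
	fmt.Println(verifyCheckpointInvariant(ok))
	fmt.Println(verifyCheckpointInvariant(bad))
}
```

The frontier invariant could presumably reuse the same shape, with the frontier timestamp taking the place of the per-range checkpoint.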

Functionality that still needs adding:

[x] intent creation
[x] intent resolution

Some more ad-hoc settings that might be left out:

[ ] set cluster settings
[ ] span config propagations

Functionality for randomized tests:

[ ] verify key update completeness
[ ] generate random key sequence
[ ] move replica
[ ] move leaseholder
[ ] merge range

Not all of these translate directly into data-driven statements.

Jira issue: CRDB-16628

Epic CRDB-39959

blathers-crl[bot] commented 2 years ago

cc @cockroachdb/replication

erikgrinaker commented 1 year ago

@aliher1911 This seems like it'd be very useful for the rangefeed refactor, reassigning to you and pulling into 23.2.

cockroachdb / cockroach

rangefeed: add data-driven test harness #82715