Potential Failure Mode: Mixed-version scenarios may not be adequately tested for certain operations, which could lead to incorrect results or corruption with larger version jumps.
Worries:
the mixed state with a skipped version seems problematic, e.g.
24.1 → 24.3 (no finalization at 24.2)
until you finalize (run the upgrade), you're in a mixed state (still at 24.1)
(customers are hesitant to finalize, because they can't roll back afterwards)
serialized data: could run into issues
rolling restart state
we can't drop compatibility for a while
there are things which may NOT be version-gated, e.g. a protobuf field
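To make the worry above concrete, here is a minimal toy model of the skipped-version mixed state. It is only a sketch: the `Cluster` class, its fields, and its methods are hypothetical names invented for illustration and are not CockroachDB's API. It assumes finalization moves the active version to the lowest binary version present.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# A release is modeled as (year, release), e.g. (24, 1). Hypothetical encoding.
Version = Tuple[int, int]


@dataclass
class Cluster:
    """Toy model only; not CockroachDB's real upgrade machinery."""

    active_version: Version  # last finalized version
    node_binaries: List[Version] = field(default_factory=list)

    def is_mixed(self) -> bool:
        # Mixed while any node's binary differs from the finalized version.
        # This covers both a rolling restart in progress and a fully
        # restarted cluster that has not been finalized yet.
        return any(b != self.active_version for b in self.node_binaries)

    def gate_active(self, gate: Version) -> bool:
        # A version-gated feature turns on only once its version is finalized.
        return self.active_version >= gate

    def finalize(self) -> None:
        # Assumption: finalize to the lowest binary version present.
        # There is deliberately no way back in this model.
        self.active_version = min(self.node_binaries)


if __name__ == "__main__":
    cluster = Cluster(active_version=(24, 1), node_binaries=[(24, 1)] * 3)
    # Rolling restart straight to 24.3, skipping 24.2.
    for i in range(len(cluster.node_binaries)):
        cluster.node_binaries[i] = (24, 3)
        print(cluster.node_binaries, "mixed:", cluster.is_mixed())
    # All binaries are 24.3, but gates for 24.2 and 24.3 are still off.
    print("24.2 gate active:", cluster.gate_active((24, 2)))
    cluster.finalize()
    print("after finalize, mixed:", cluster.is_mixed())
```

In this model everything behind `gate_active` stays off until `finalize` runs, but anything that is not gated (such as a new protobuf field written by the 24.3 binary) is live as soon as the first node restarts. That is the part the mixed-version tests would need to exercise.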
Eng Team: Test Eng
Recommendation: run all tests in a mixed-version state to increase our confidence.
Roachtests:
long-running; can stay in a mixed-version state.
this would be the weekly job (not nightly).
right now there is a mixed-version, multi-region TPCC test that runs for ~48 hours.
new work would be to extend the time this test spends in the mixed-version state.
things to iron out:
number of mixed versions, and length of time in each state
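One way to reason about these two open questions is to enumerate the upgrade pairs to cover and divide the run time among them. The sketch below is illustrative only: `upgrade_pairs`, `schedule`, the `max_skip` parameter, and the even split of hours are all assumptions, not part of roachtest.

```python
from typing import List, Tuple


def upgrade_pairs(releases: List[str], max_skip: int) -> List[Tuple[str, str]]:
    """Return (from, to) pairs where at most `max_skip` releases are skipped.

    `releases` must be ordered oldest to newest.
    """
    pairs = []
    for i, src in enumerate(releases):
        # The furthest allowed target is max_skip + 1 positions ahead of src.
        last = min(i + max_skip + 2, len(releases))
        for j in range(i + 1, last):
            pairs.append((src, releases[j]))
    return pairs


def schedule(
    releases: List[str], max_skip: int, total_hours: float
) -> List[Tuple[str, str, float]]:
    """Split the total run time evenly across mixed-version states.

    The even split is an assumption; a real plan might weight skipped
    upgrades more heavily.
    """
    pairs = upgrade_pairs(releases, max_skip)
    if not pairs:
        return []
    hours_each = total_hours / len(pairs)
    return [(src, dst, hours_each) for src, dst in pairs]


if __name__ == "__main__":
    for src, dst, hours in schedule(["24.1", "24.2", "24.3"], 1, 48):
        print(f"{src} -> {dst}: hold mixed state for {hours} hours")
```

With three releases and one skip allowed there are three mixed-version states, so the number of states grows quickly as more releases or larger skips are supported. That growth is the trade-off against the time spent in each state.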
SQL Logic Tests:
can't really test the long-running aspect (these are quick checks for correctness).
Jira issue: CRDB-39089
Epic RE-572