Migrate table properties, including partition fields, key generators, payload type, and bootstrap index type, for both upgrade and downgrade.
Migrate the timeline to the new layout: a) move the archived timeline to the LSM timeline layout, b) read both JSON and Avro commit metadata, c) rename instants (including the clustering action). All of these are done for upgrade. Downgrade still needs a writer that converts the LSM timeline back to the legacy (v1) archived timeline.
Run a full compaction of the table to get rid of log files, for both upgrade and downgrade.
Drop table version 7.
Add some tests for the above.
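To make the upgrade/downgrade flow above easier to picture, here is a minimal sketch of stepping a table between versions when version 7 no longer exists as a stop. It is not Hudi code: the property keys (`table.version`, `timeline.layout`), the function names, and the assumption that 6 and 8 become adjacent versions once 7 is dropped are all hypothetical, chosen only for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical: versions this sketch treats as supported. Version 7 is
# deliberately absent, so 6 and 8 are adjacent hops.
SUPPORTED_VERSIONS: List[int] = [6, 8]

Props = Dict[str, str]
Step = Callable[[Props], Props]


def upgrade_6_to_8(props: Props) -> Props:
    """Illustrative upgrade step: bump the version and switch the timeline layout."""
    out = dict(props)
    out["table.version"] = "8"      # hypothetical key
    out["timeline.layout"] = "lsm"  # hypothetical key and value
    return out


def downgrade_8_to_6(props: Props) -> Props:
    """Illustrative downgrade step: the inverse of upgrade_6_to_8."""
    out = dict(props)
    out["table.version"] = "6"
    out["timeline.layout"] = "legacy"
    return out


# One handler per (from, to) hop, covering both directions.
STEPS: Dict[Tuple[int, int], Step] = {
    (6, 8): upgrade_6_to_8,
    (8, 6): downgrade_8_to_6,
}


def plan(from_version: int, to_version: int) -> List[Tuple[int, int]]:
    """Return the ordered hops between two supported versions.

    Raises ValueError if either version is not supported (for example 7).
    """
    i = SUPPORTED_VERSIONS.index(from_version)
    j = SUPPORTED_VERSIONS.index(to_version)
    direction = 1 if j > i else -1
    return [
        (SUPPORTED_VERSIONS[k], SUPPORTED_VERSIONS[k + direction])
        for k in range(i, j, direction)
    ]


def migrate(props: Props, to_version: int) -> Props:
    """Apply each hop's handler in order, returning the migrated properties."""
    current = int(props["table.version"])
    for hop in plan(current, to_version):
        props = STEPS[hop](props)
    return props
```

Keeping one handler per hop and direction means a downgrade is just the same plan walked backwards, which is also where a step such as the missing LSM to legacy archive writer would plug in.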
TODO:
Write the LSM to legacy archived timeline writer for use in downgrade.
Migration path for CDC and incremental queries.
Handle differences between 0.x and 1.x in operations needed during upgrade, e.g. compaction (need to compact older file slices) and rollback (marker differences between 0.14 and 0.15). If we compact and delete any leftover markers, this might be okay, but these scenarios need to be tested.
Impact
Support rolling upgrade to table version 8.
Risk level (write none, low medium or high below)
high
Documentation Update
Describe any necessary documentation update if there is any new feature, config, or user-facing change. If not, put "none".
The config description must be updated if new configs are added or the default value of the configs are changed
Any new feature or user-facing change requires updating the Hudi website. Please create a Jira ticket, attach the ticket number here and follow the instruction to make changes to the website.
Contributor's checklist