apache / seatunnel

SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.
https://seatunnel.apache.org/
Apache License 2.0

[Umbrella] CDC DDL Sync Design(Zeta) #7930

Closed by hailin0 1 day ago

hailin0 commented 4 weeks ago

Describe the proposal

Background

SeaTunnel currently supports change data capture (CDC, #3175), but there is no design yet for schema evolution.

For CDC data synchronization, I think we need to support schema evolution (DDL) as a feature, and I would like to hear how you all think it could be implemented in SeaTunnel.

Motivation

Overall Design

Basic flow

image

Phase1 - Before Change

image

Phase2 - Starting Change

image

Phase3 - Splitting data flow and structure flow

image

Phase4 - Handling schema-change-before signal

image

Phase5 - Execute DDL on source & sink

image

Phase6 - Handling schema-change-after signal

image

Phase7 - Completed

image
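
The diagrams for the phases above are not reproduced here, so the following is only a rough sketch of what the phase titles suggest: the source brackets each DDL event with a schema-change-before and a schema-change-after signal, and the sink flushes buffered rows on the before signal, applies the DDL, then resumes on the after signal. Every name in it (`Row`, `SchemaChange`, `SchemaChangeBefore`, `SchemaChangeAfter`, `Sink`, `wrap_ddl`) is hypothetical and is not SeaTunnel's API; the real Zeta design also involves checkpoints and distributed coordination, which this single-process sketch ignores.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Iterator, List, Union


@dataclass
class Row:
    values: Dict[str, object]


@dataclass
class SchemaChange:
    # Hypothetical DDL event: only "add column" is modelled here.
    table: str
    add_column: str


@dataclass
class SchemaChangeBefore:
    table: str


@dataclass
class SchemaChangeAfter:
    table: str


Event = Union[Row, SchemaChange, SchemaChangeBefore, SchemaChangeAfter]


def wrap_ddl(events: Iterable[Event]) -> Iterator[Event]:
    """Source side: bracket each schema change with before/after signals."""
    for event in events:
        if isinstance(event, SchemaChange):
            yield SchemaChangeBefore(event.table)
            yield event
            yield SchemaChangeAfter(event.table)
        else:
            yield event


@dataclass
class Sink:
    columns: List[str]
    buffer: List[Dict[str, object]] = field(default_factory=list)
    written: List[Dict[str, object]] = field(default_factory=list)
    log: List[str] = field(default_factory=list)

    def flush(self) -> None:
        # Write out everything buffered under the current schema.
        self.written.extend(self.buffer)
        self.buffer.clear()

    def handle(self, event: Event) -> None:
        if isinstance(event, Row):
            self.buffer.append(event.values)
        elif isinstance(event, SchemaChangeBefore):
            # Drain rows that still use the old schema before the DDL runs.
            self.flush()
            self.log.append("before")
        elif isinstance(event, SchemaChange):
            self.columns.append(event.add_column)
            self.log.append(f"ddl:add {event.add_column}")
        elif isinstance(event, SchemaChangeAfter):
            # New-schema rows may flow again from here on.
            self.log.append("after")


sink = Sink(columns=["id"])
events = [
    Row({"id": 1}),
    SchemaChange("t", "name"),
    Row({"id": 2, "name": "a"}),
]
for e in wrap_ddl(events):
    sink.handle(e)
sink.flush()
```

The point of the before signal in this sketch is ordering: rows written under the old schema are flushed before the sink's column list changes, so no row is ever written against a schema it was not produced for.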

Task list

hailin0 commented 4 weeks ago

Linked issue: https://github.com/apache/seatunnel/issues/3175

Carl-Zhou-CN commented 1 week ago

Niubility awesome

dailai commented 1 week ago

> Niubility awesome

+1

Carl-Zhou-CN commented 1 week ago

> Niubility awesome
>
> +1

Ha ha, you learned