[X] I searched in the issues and found nothing similar.
Motivation
AWS Database Migration Service (DMS) is a service that performs homogeneous and heterogeneous database migrations. It can migrate data and replicate ongoing changes, which helps to build data lakes and perform real-time processing on change data from your data stores.
By adding support for the DMS CDC data format, developers can build a streaming data lake by streaming CDC data to Kafka and ingesting it into the data lake with the Apache Paimon table format, and this is awesome.
The DMS CDC data format is similar to Maxwell's. You can find details about the format and about using Kafka as a DMS target at this link. Here are the details of the DMS JSON format:
RecordType
The record type can be either data or control. Data records represent the actual rows in the source. Control records are for important events in the stream, for example a restart of the task.
Operation
For data records, the operation can be load, insert, update, or delete.
For control records, the operation can be create-table, rename-table, drop-table, change-columns, add-column, drop-column, rename-column, or column-type-change.
SchemaName
The source schema for the record. This field can be empty for a control record.
TableName
The source table for the record. This field can be empty for a control record.
Timestamp
The timestamp for when the JSON message was constructed. The field is formatted with the ISO 8601 format.
The following JSON message example illustrates a data type message with all additional metadata.
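A message of this shape might be consumed roughly as in the sketch below. The key spellings (`metadata`, `data`, `record-type`, `operation`, `schema-name`, `table-name`, `timestamp`) are assumptions based on the AWS DMS documentation for Kafka targets. The sample values, the helper name `parse_dms_record`, and the mapping of operations to changelog row kinds are made up for illustration and are not Paimon's actual implementation.

```python
import json
from datetime import datetime

# Hypothetical mapping from DMS data operations to changelog row kinds.
# Treating "update" as a plain "+U" assumes no before-image is attached.
ROW_KINDS = {"load": "+I", "insert": "+I", "update": "+U", "delete": "-D"}


def parse_dms_record(raw: str):
    """Parse one DMS JSON message; return None for control records."""
    message = json.loads(raw)
    meta = message["metadata"]  # assumed key name

    # Control records describe stream events (e.g. create-table), not rows.
    if meta["record-type"] != "data":
        return None

    # The timestamp is ISO 8601; normalise a trailing "Z" so that
    # datetime.fromisoformat also accepts it on Python versions before 3.11.
    timestamp = datetime.fromisoformat(meta["timestamp"].replace("Z", "+00:00"))

    return {
        "kind": ROW_KINDS[meta["operation"]],
        "table": f'{meta["schema-name"]}.{meta["table-name"]}',
        "row": message["data"],  # assumed key name for the row payload
        "timestamp": timestamp,
    }


# Illustrative message; the values are invented, not taken from a real task.
sample = json.dumps({
    "data": {"id": 1, "name": "alice"},
    "metadata": {
        "timestamp": "2024-01-01T00:00:00.000000Z",
        "record-type": "data",
        "operation": "insert",
        "schema-name": "shop",
        "table-name": "users",
    },
})

record = parse_dms_record(sample)
```

Skipping control records keeps the sketch short. A real format implementation would likely need to handle operations such as `add-column` or `column-type-change` to support schema evolution.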
Solution
Provide an AWS DMS CDC format implementation in Apache Paimon.
Anything else?
No response
Are you willing to submit a PR?