DataLinkDC / dinky

Dinky is a real-time data development platform based on Apache Flink, enabling agile data development, deployment and operation.
http://www.dinky.org.cn
Apache License 2.0

[Feature][CDCSOURCE] Supports filter data #2388

Open stdnt-xiao opened 1 year ago

stdnt-xiao commented 1 year ago

Description

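I would like CDCSOURCE to support data filtering, so that bidirectional synchronization between multi-center databases becomes possible. The filter would prevent data loops, where a change replicated to the other database is captured there and sent back again. A similar approach is described here: https://www.confluent.io/blog/sync-databases-and-remove-silos-with-kafka-cdc/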
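To illustrate why the filter breaks the loop, here is a minimal Python sketch. It is not Dinky or Flink code: it assumes every change record carries an `SRC` field naming the database where the change was first made, and `count_hops` is a hypothetical helper that follows one change between two databases.

```python
def count_hops(origin, filter_enabled, max_hops=10):
    """Count how many times one change is forwarded between two databases."""
    other = {"MYSQL": "SQLSRV", "SQLSRV": "MYSQL"}
    record = {"id": 1, "SRC": origin}  # SRC tags where the change was first made
    location = origin                  # database whose CDC job sees the change now
    hops = 0
    while hops < max_hops:
        target = other[location]
        # Exclude rule: never forward a change to the database it came from.
        if filter_enabled and record["SRC"] == target:
            break
        hops += 1
        location = target
    return hops


print(count_hops("MYSQL", filter_enabled=True))   # 1
print(count_hops("MYSQL", filter_enabled=False))  # 10, stopped only by max_hops
```

With the filter, the change is replicated once and then dropped when it would travel back to its origin. Without it, the change keeps bouncing until the artificial `max_hops` limit.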

My current idea is to extend the parameters of 'flink-kafka-connector' and filter records in the 'write' function according to the 'sink.filter.pattern' rule.

Are there any better suggestions? Thanks.

EXECUTE CDCSOURCE jobname WITH (
  'connector' = 'mysql-cdc',
  'hostname' = '127.0.0.1',
  'port' = '3306',
  'username' = 'dlink',
  'password' = 'dlink',
  'checkpoint' = '3000',
  'scan.startup.mode' = 'initial',
  'parallelism' = '1',
  'table-name' = 'test.student,test.score',
  'sink.connector' = 'datastream-kafka',
  'sink.brokers' = '127.0.0.1:9092',
  'sink.filter.pattern' = '$[?(@.SRC == "SQLSRV")]',
  'sink.filter.model' = 'exclude'
)
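The two proposed options, 'sink.filter.pattern' and 'sink.filter.model', could be interpreted roughly as follows. This is a minimal Python sketch, not the connector's implementation: `matches` and `should_write` are hypothetical helpers, the JSONPath expression is replaced by a plain equality check on the `SRC` field, and the `include` model is an assumed counterpart to `exclude` that the issue does not mention.

```python
def matches(record, field, expected):
    """Stand-in for evaluating the JSONPath filter $[?(@.<field> == "<expected>")]."""
    return record.get(field) == expected


def should_write(record, field, expected, model):
    """Decide whether the sink writes the record, given the filter model."""
    hit = matches(record, field, expected)
    if model == "exclude":
        return not hit   # drop records that match the pattern
    if model == "include":
        return hit       # assumed counterpart: keep only matching records
    raise ValueError(f"unknown filter model: {model}")


records = [
    {"id": 1, "SRC": "SQLSRV"},
    {"id": 2, "SRC": "MYSQL"},
]
written = [r["id"] for r in records if should_write(r, "SRC", "SQLSRV", "exclude")]
print(written)  # [2]
```

A real implementation would need an actual JSONPath evaluator and would have to decide how to treat records that lack the filtered field. In this sketch such records never match, so `exclude` lets them through.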

Use case

No response

Related issues

No response

aiwenmo commented 1 year ago

Worth a try.

github-actions[bot] commented 9 months ago

Hello, this issue has not been active for more than 30 days. This issue will be closed in 7 days if there is no response. If you have any questions, you can comment and reply.
