Open quicklyfast opened 3 days ago
You can set `parallelism = 3` in the source reader and `parallelism = 1` in the Paimon sink.

I have tested this and reviewed the SeaTunnel 2.3.7 source code, and found that the sink does not support configuring the parallelism parameter. The parallelism of the sink will be equal to the upstream parallelism.
env {
  parallelism = 3
  job.mode = "BATCH"
  checkpoint.interval = 10000
  job.retry.times = 0
}

source {
  Jdbc {
    parallelism = 3
    url = "jdbc:mysql://192.168.1.132:19030/test"
    driver = "com.mysql.jdbc.Driver"
    connection_check_timeout_sec = 100
    user = "root"
    password = "****"
    table_path = "test.ods_ecrp_kd_order"
    partition_column = "id"
    split.size = 1000
    properties {
      useSSL = false
      useCursorFetch = true
      fetchSize = 1000
    }
  }
}

sink {
  Paimon {
    parallelism = 1
    catalog_name = "seatunnel_test"
    warehouse = "file:/home/data/seatunnel_test"
    database = "${database_name}"
    paimon.table.primary-keys = "id,create_time"
    paimon.table.write-props = {
      bucket = 4
      bucket-key = "create_time"
      snapshot.num-retained.min = 3
      snapshot.num-retained.max = 10
      file.format = "orc"
      deletion-vectors.enabled = "true"
    }
    table = "${table_name}"
  }
}
* job info

"createTime": "2024-09-30 09:16:21",
"jobDag": {
  "vertices": [
    {
      "id": 1,
      "name": "Source[0]-Jdbc(id=1)",
      "parallelism": 3
    },
    {
      "id": 3,
      "name": "Sink[0]-Paimon-MultiTableSink(id=3)",
      "parallelism": 3
    }
  ],
  "edges": [
    {
      "inputVertex": "Source[0]-Jdbc",
      "targetVertex": "Sink[0]-Paimon-MultiTableSink"
    }
  ]
}
Search before asking
Description
We want to concurrently sync data from MySQL to a Paimon PK table. However, due to the limitations of the Paimon PK table, we cannot update it concurrently and must set the concurrency level for reading to 1, which affects the efficiency of data synchronization to Paimon. We would like to be able to read the source data concurrently with three threads and then write to the Paimon table with a single thread.

Usage Scenario
No response
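
The topology requested in the description (three concurrent readers feeding one writer) can be illustrated with a small Python sketch. This is not SeaTunnel code and says nothing about how SeaTunnel would implement it. It uses plain threads and a queue, and every name in it (`reader`, `single_writer`, `run`, `NUM_READERS`) is hypothetical. A Python list stands in for the Paimon table.

```python
import queue
import threading

NUM_READERS = 3      # mirrors parallelism = 3 on the source
SENTINEL = object()  # marks the end of one reader's split


def reader(split, out):
    """Read one split of the source and hand its rows to the shared queue."""
    for row in split:
        out.put(row)
    out.put(SENTINEL)


def single_writer(inbox, num_readers, table):
    """Only this thread touches the table, so writes are never concurrent."""
    finished = 0
    while finished < num_readers:
        item = inbox.get()
        if item is SENTINEL:
            finished += 1
        else:
            table.append(item)


def run(rows):
    # Split the input round-robin, one split per reader thread.
    splits = [rows[i::NUM_READERS] for i in range(NUM_READERS)]
    inbox = queue.Queue(maxsize=1000)  # bounded, so readers cannot outrun the writer
    table = []                         # stand-in for the target table
    writer = threading.Thread(target=single_writer, args=(inbox, NUM_READERS, table))
    readers = [threading.Thread(target=reader, args=(s, inbox)) for s in splits]
    writer.start()
    for t in readers:
        t.start()
    for t in readers:
        t.join()
    writer.join()
    return table


written = run(list(range(10_000)))
```

Reads proceed in parallel while all writes are funneled through one thread. That is the behavior the `parallelism = 3` source and `parallelism = 1` sink settings were meant to express.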
Related issues
No response
Are you willing to submit a PR?
Code of Conduct