apache / seatunnel

SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.
https://seatunnel.apache.org/
Apache License 2.0
8.01k stars 1.81k forks

sync data speed is too low, please help me #7133

Open zhangtao106 opened 4 months ago

zhangtao106 commented 4 months ago

Describe the proposal

I am synchronizing data from DRDS to Doris. My query range is "id < 50000000", although the table holds fewer than 50000000 rows in total. The synchronization is very slow, much slower than DataX.
My table in DRDS has 42 columns, and this is my config:

env {
  checkpoint.interval = 10000
  job.mode = "batch"
}

source {
  Jdbc {
    url = "jdbc:mysql://x.x.x.x:3306/pp_data_ds?useSSL=false&serverTimezone=GMT%2B8&useUnicode=true&characterEncoding=utf8&socketTimeout=6000000&waitTimeout=60000&autoReconnect=true&rewriteBatchedStatements=true&connectTimeout=60000"
    driver = "com.mysql.cj.jdbc.Driver"
    connection_check_timeout_sec = 100
    user = "user_ro"
    password = "******"
    query = "select province_name, ... from table_data where id >0 and id < 50000000"
  }
}

sink {
  Doris {
    fenodes = "x.x.x.x:8030"
    username = test
    password = "***"
    database = "test"
    table = "td_sub_sms_port_bk"
    sink.label-prefix = "test-batch"
    sink.enable-2pc = "true"
    sink.enable-delete = "true"
    doris.config {
      format = "json"
      read_json_by_line = "true"
    }
  }
}

When I submit the job config to the cluster, the log output updates slowly.

Task list

env {
  checkpoint.interval = 10000
  job.mode = "batch"
}

source {
  Jdbc {
    url = "jdbc:mysql://x.x.x.x:3306/pp_data_ds?useSSL=false&serverTimezone=GMT%2B8&useUnicode=true&characterEncoding=utf8&socketTimeout=6000000&waitTimeout=60000&autoReconnect=true&rewriteBatchedStatements=true&connectTimeout=60000"
    driver = "com.mysql.cj.jdbc.Driver"
    connection_check_timeout_sec = 100
    user = "user_ro"
    password = "**"
    query = "select province_name, ... from table_data where id >0 and id < 50000000"
  }
}

sink {
  Doris {
    fenodes = "x.x.x.x:8030"
    username = test
    password = "***"
    database = "test"
    table = "td_sub_sms_port_bk"
    sink.label-prefix = "test-batch"
    sink.enable-2pc = "true"
    sink.enable-delete = "true"
    doris.config {
      format = "json"
      read_json_by_line = "true"
    }
  }
}
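
A possible direction, offered as an untested sketch and not a confirmed fix from this thread: the source above sets only `query`, so the read is likely handled as a single split by one reader. The SeaTunnel JDBC source documents `partition_column`, `partition_num`, `partition_lower_bound` and `partition_upper_bound` for splitting a numeric range across parallel readers, and `parallelism` can be set in `env`. The fragment below assumes `id` is a numeric column (the query compares it to numbers); the value 8 and the bounds are illustrative guesses derived from the query range, not measured settings. The sink block is unchanged and omitted here.

```hocon
env {
  checkpoint.interval = 10000
  job.mode = "batch"
  # Assumption: 8 parallel tasks; tune to what the cluster and DRDS can sustain.
  parallelism = 8
}

source {
  Jdbc {
    # url, driver, user, password and connection_check_timeout_sec as in the config above
    query = "select province_name, ... from table_data where id >0 and id < 50000000"

    # Split the read on the numeric id column instead of one large scan.
    partition_column = "id"
    # Illustrative values: the number of splits and the id range taken from the query.
    partition_num = 8
    partition_lower_bound = 1
    partition_upper_bound = 49999999
  }
}
```

If the `id` values are heavily skewed, evenly spaced splits would be uneven in size, so the benefit may be smaller than the split count suggests.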

github-actions[bot] commented 3 months ago

This issue has been automatically marked as stale because it has not had recent activity for 30 days. It will be closed in the next 7 days if no further activity occurs.

panpan2019 commented 3 months ago

I can fix

Anoye commented 3 weeks ago

I encountered the same problem. Have you solved it yet?

richardblabin commented 3 weeks ago

I can fix

How? Please be more specific. Thanks a lot.