luckyliush opened this issue 1 month ago (status: Open)
Try adding the `rewriteBatchedStatements=true` parameter to your JDBC URL.
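For context on why this flag matters: by default, MySQL Connector/J sends each statement in a JDBC batch to the server separately; with `rewriteBatchedStatements=true` it rewrites the batch into a single multi-row INSERT, cutting round trips. Below is a rough, simplified sketch of that rewrite (illustration only, not the driver's actual code; the class and method names are hypothetical):

```java
import java.util.List;

// Simplified illustration of what Connector/J does when
// rewriteBatchedStatements=true: N queued parameter sets are
// rewritten into ONE multi-row INSERT, so the server receives a
// single statement instead of N separate ones.
public class BatchRewriteDemo {

    // Collapse a batch of value tuples into one multi-row INSERT.
    static String rewriteBatch(String table, List<String[]> rows) {
        StringBuilder sql = new StringBuilder("INSERT INTO " + table + " VALUES ");
        for (int i = 0; i < rows.size(); i++) {
            if (i > 0) sql.append(", ");
            sql.append("(").append(String.join(", ", rows.get(i))).append(")");
        }
        return sql.toString();
    }

    public static void main(String[] args) {
        List<String[]> batch = List.of(
                new String[]{"1", "'a'"},
                new String[]{"2", "'b'"},
                new String[]{"3", "'c'"});
        // Without the flag: three INSERT statements. With it: one.
        System.out.println(rewriteBatch("t", batch));
        // -> INSERT INTO t VALUES (1, 'a'), (2, 'b'), (3, 'c')
    }
}
```

Note that the real driver only performs this rewrite for batches built through `PreparedStatement.addBatch()` and executed via `executeBatch()`.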
Thank you, but this parameter was already added; it did not meet expectations.
```
plugin_name = jdbc
user = xxxxx
url = "jdbc:mysql://xxxxxx/xxxxxxx?allowMultiQueries=true&useUnicode=true&characterEncoding=UTF-8&serverTimezone=Asia/Shanghai&zeroDateTimeBehavior=convertToNull&tinyInt1isBit=false&rewriteBatchedStatements=true&useSSL=false"
enable_upsert = true
generate_sink_sql = true
database = db_name
table = table_name
primary_keys = [xxx,xxx]
```
Try this config; it will auto-generate the insert SQL. I use this config and the write speed is good.
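For reference, with `enable_upsert = true` and `generate_sink_sql = true`, the MySQL dialect typically generates an upsert of roughly this shape (a sketch with hypothetical table and column names, not the exact SQL SeaTunnel emits):

```sql
-- Hypothetical columns; the real statement is generated from the
-- configured table and primary_keys. Rows matching an existing
-- unique/primary key are updated; others are inserted.
INSERT INTO table_name (id, region, val)
VALUES (?, ?, ?)
ON DUPLICATE KEY UPDATE val = VALUES(val);
```

Note that `ON DUPLICATE KEY UPDATE` only behaves as an upsert if the target MySQL table actually has a primary or unique key on the configured columns; otherwise every row is treated as a plain insert.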
```
env {
  execution.parallelism = 10
  job.mode = "BATCH"
}

source {
  Hive {
    table_name = ""
    metastore_uri = ""
    result_table_name = "Table_test"
    hdfs_site_path = "/home/hadoop/hadoop-3.2.2/etc/hadoop/hdfs-site.xml"
    hive_site_path = "/home/hadoop/hive-2.3.9/conf/hive-site.xml"
  }
}

transform {
  sql {
    source_table_name = "Table_test"
    query = "select xxx,xxx from Table_test"
    result_table_name = "Table_test2"
  }
}

sink {
  Jdbc {
    url = "jdbc:mysql://xxx:3306/xxx?allowMultiQueries=true&useUnicode=true&characterEncoding=UTF-8&serverTimezone=Asia/Shanghai&zeroDateTimeBehavior=convertToNull&tinyInt1isBit=false&rewriteBatchedStatements=true&useSSL=false"
    driver = "com.mysql.cj.jdbc.Driver"
    user = "root"
    password = "xxx"
    enable_upsert = true
    generate_sink_sql = true
    database = "xxx"
    table = "xxx"
    primary_keys = [xxx,xxx,xxx]
  }
}
```
I am using version 2.3.1 with the configuration shown above, but the extraction speed is the same as before; it has not improved. Is there something wrong with my configuration? Roughly 12,000 rows per second are extracted. I set the `primary_keys` parameter based on the granularity fields of the Hive table, but those fields are not defined as primary keys in the MySQL table. Does that have any impact?
Hello. In seatunnel-2.3.5, using the same configuration, the parallelism parameter does not take effect. But after adding the parameter `read_limit.rows_per_second=10000` in 2.3.5, the parallelism parameter does take effect and the extraction speed improves significantly. Do you know the reason?
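For context, the rate-limit option mentioned above goes in the `env` block; a minimal sketch (assuming the 2.3.5 option name as reported in this thread):

```
env {
  execution.parallelism = 10
  job.mode = "BATCH"
  # Caps read throughput per second; as reported above, adding it
  # also made parallelism take effect on 2.3.5.
  read_limit.rows_per_second = 10000
}
```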
This issue has been automatically marked as stale because it has not had recent activity for 30 days. It will be closed in the next 7 days if no further activity occurs.
I had the same issue on version 2.3.5: the parallelism parameter does not work; only one parallel task runs.
Search before asking
What happened
The data volume is 15 million rows. Extracting Hive data to MySQL only reaches about 13,000 records per second, so the extraction takes nearly 30 minutes, even though I set the SeaTunnel parallelism parameter to 10. After the extraction task started, I ran `show processlist` on the MySQL client and found only one insert thread. I then tested extracting the same amount of Hive data to the console, which took less than 1 minute.
SeaTunnel Version
2.3.1 and 2.3.5
SeaTunnel Config
Running Command
Error Exception
Zeta or Flink or Spark Version
Zeta
Java or Scala Version
1.8
Screenshots
No response
Are you willing to submit PR?
Code of Conduct