Closed: trikker closed this issue 7 months ago
It would go through the uneven chunk-splitting logic; the size of each chunk shouldn't be particularly extreme, right?
Same issue here. @trikker, do you have a workaround you could share? Thanks!
This bug is caused by the character set and collation rules of the MySQL database.
We use Java to compare the table's max value with the chunk-end value when checking the end bound, but the max value comes from SELECT MAX(columnName) FROM TABLENAME, and that result is affected by the database's character set and collation rules.
I have a PR to fix it, but I'm not sure whether it is appropriate:
https://github.com/ververica/flink-cdc-connectors/pull/2968
This problem can be reproduced when the primary key is a varchar column. Example:
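A minimal Java sketch of the mismatch, assuming a case-insensitive (*_ci) collation on the split column; the strings below are hypothetical, and java.text.Collator only stands in for MySQL's collation-aware comparison:

```java
import java.text.Collator;
import java.util.Locale;

public class CollationMismatchSketch {
    public static void main(String[] args) {
        String chunkEnd = "bbb";   // chunk end bound computed by the splitter
        String maxFromDb = "CCC";  // result of SELECT MAX(col1) under a *_ci collation

        // Java's String.compareTo is a raw UTF-16 code-unit comparison:
        // 'C' (0x43) sorts before 'b' (0x62), so the max looks smaller than the chunk end.
        System.out.println(maxFromDb.compareTo(chunkEnd)); // negative

        // A case-insensitive, collation-aware comparison (like MySQL's *_ci collations)
        // orders "CCC" after "bbb", which is the ordering MAX() actually used.
        Collator caseInsensitive = Collator.getInstance(Locale.ROOT);
        caseInsensitive.setStrength(Collator.PRIMARY);
        System.out.println(caseInsensitive.compare(maxFromDb, chunkEnd)); // positive
    }
}
```

Because the two orderings disagree, the end-bound check can conclude that the maximum has not been reached yet and emit a chunk whose range covers far more rows than intended.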
Hi @trikker. MySQL CDC supports setting scan.incremental.snapshot.chunk.key-column to select a column of the primary key to split chunks.
Hi @trikker. MySQL CDC supports setting scan.incremental.snapshot.chunk.key-column to select a column of the primary key to split chunks.

When the chunk key column is a varchar, the uneven chunk-splitting logic can be affected by the database's character set and collation rules, leading to the creation of very large chunks and causing OOM errors. This issue is unrelated to whether there are multiple primary-key columns.
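For illustration, a minimal Flink SQL DDL (executed here through the Java Table API) showing how that option can pin the chunk key to a non-varchar primary-key column; the schema and connection properties below are assumptions, not taken from this issue:

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class ChunkKeyColumnSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());

        // Hypothetical source table with a composite primary key (id, col1);
        // pointing the splitter at the BIGINT column avoids splitting on the varchar one.
        tEnv.executeSql(
                "CREATE TABLE t_source (\n"
                        + "  id BIGINT,\n"
                        + "  col1 STRING,\n"
                        + "  PRIMARY KEY (id, col1) NOT ENFORCED\n"
                        + ") WITH (\n"
                        + "  'connector' = 'mysql-cdc',\n"
                        + "  'hostname' = 'localhost',\n"
                        + "  'port' = '3306',\n"
                        + "  'username' = 'flink',\n"
                        + "  'password' = 'pass',\n"
                        + "  'database-name' = 'mydb',\n"
                        + "  'table-name' = 't',\n"
                        + "  'scan.incremental.snapshot.chunk.key-column' = 'id'\n"
                        + ")");
    }
}
```

With the splitter keyed on the BIGINT column, chunk bounds are compared numerically, so the collation mismatch described above no longer applies.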
Closing this issue as it has been migrated to Apache Jira.
Search before asking
Flink version
1.17.1
Flink CDC version
flink-connector-mysql-cdc-2.4.2
Database and its version
source: MySQL 8.0.19; destination: Doris 2.0.2
Minimal reproduce step
Execute the following SQL 5,000,000 times:
insert into t values('bbb', repeat('a', 64), now(6), repeat('a', 200), repeat('a', 500));
Then execute the following SQL 5,000,000 times:
insert into t values('ccc', repeat('a', 64), now(6), repeat('a', 200), repeat('a', 500));
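For reference, a small JDBC sketch that replays these two statements in batches (the connection URL and credentials are assumptions; the column values are taken from the statements above):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class SeedReproData {
    public static void main(String[] args) throws Exception {
        String sql = "insert into t values(?, repeat('a', 64), now(6), repeat('a', 200), repeat('a', 500))";
        try (Connection conn = DriverManager.getConnection(
                        "jdbc:mysql://localhost:3306/mydb", "flink", "pass");
                PreparedStatement ps = conn.prepareStatement(sql)) {
            conn.setAutoCommit(false);
            for (String key : new String[] {"bbb", "ccc"}) {
                for (int i = 1; i <= 5_000_000; i++) {
                    ps.setString(1, key);
                    ps.addBatch();
                    if (i % 10_000 == 0) { // flush and commit in batches to bound client memory
                        ps.executeBatch();
                        conn.commit();
                    }
                }
                ps.executeBatch();
                conn.commit();
            }
        }
    }
}
```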
CREATE DATABASE mydb;
create user flink identified by 'pass';
grant all on *.* to flink;
flink-doris-connector-1.17-1.4.0.jar flink-sql-connector-mysql-cdc-2.4.2.jar
bin/flink run -d \
-Dexecution.checkpointing.interval=10s \
-Dparallelism.default=1 \
-c org.apache.doris.flink.tools.cdc.CdcTools \
lib/flink-doris-connector-1.17-1.4.0.jar \
mysql-sync-database \
--database idc_manager \
--job-name flink_sync_mysql_to_doris \
--mysql-conf hostname= \
--mysql-conf port=3306 \
--mysql-conf username=flink \
--mysql-conf password=pass \
--mysql-conf database-name=mydb \
--including-tables "t" \
--sink-conf fenodes=:8030 \
--sink-conf username=flink \
--sink-conf password=pass \
--sink-conf jdbc-url=jdbc:mysql://:9030 \
--sink-conf sink.label-prefix=label1 \
--table-conf replication_num=3
SELECT * FROM mydb.t WHERE col1 <= 'bbb' AND NOT (col1 = 'bbb');

Caused by: java.lang.OutOfMemoryError: Java heap space