apache / doris

Apache Doris is an easy-to-use, high performance and unified analytics database.
https://doris.apache.org
Apache License 2.0
12.64k stars 3.26k forks source link

通过canal订阅binlog , 同步数据不生效 #15877

Closed tempestLXC closed 1 year ago

tempestLXC commented 1 year ago

Search before asking

Version

1.2.0

What's Wrong?

mysql 表: create table child.source_test ( id int auto_increment comment '111' primary key, name int not null comment '111' );

doris表:

CREATE TABLE test_child_ods.target_test ( id int(11), name int(11) ) ENGINE=OLAP UNIQUE KEY(id) DISTRIBUTED BY HASH(id) BUCKETS 8;

SYNC JOB: CREATE SYNC test_child_ods.test_child_job ( FROM child.source_test INTO target_test (id,name) ) FROM BINLOG ( "type"="canal", "canal.server.ip"="xx.xx.xx.xx", "canal.server.port"="11111", "canal.destination"="test_child_job", "canal.username"="", "canal.password"="", "canal.debug"="true" );

mysql侧新增一行数据, doris fe日志中打印了订阅表的 binlog日志. 但是doris对应的表没有更新成功.

fe 日志:

2023-01-12 20:58:56,584 INFO (Thread-68|113) [ReportHandler.taskReport():376] finished to handle task report from backend 11002, diff task num: 0. cost: 0 ms 2023-01-12 20:58:56,584 INFO (thrift-server-pool-1|151) [ReportHandler.handleReport():179] receive report from be 11002. type: TASK, current queue size: 1 2023-01-12 20:58:59,576 INFO (Thread-68|113) [ReportHandler.taskReport():376] finished to handle task report from backend 11003, diff task num: 0. cost: 0 ms 2023-01-12 20:58:59,576 INFO (thrift-server-pool-5|219) [ReportHandler.handleReport():179] receive report from be 11003. type: TASK, current queue size: 1 2023-01-12 20:59:01,575 INFO (tablet checker|30) [TabletChecker.checkTablets():331] finished to check tablets. unhealth/total/added/in_sched/not_ready: 0/457/0/0/0, cost: 0 ms 2023-01-12 20:59:02,267 INFO (Thread-139|902) [CanalUtils.printSummary():72] ----------- Batch Summary ------------------------------> | Batch Id: [33038] ,count : [2] , Mem size : [106] , Time : 2023-01-12 20:59:02 | Start : [mysql-bin.001015:49702262:1673528342000(2023-01-12 20:59:02)] | End : [mysql-bin.001015:49702491:1673528342000(2023-01-12 20:59:02)] ----------------------------------------------------------

2023-01-12 20:59:03,564 INFO (Thread-68|113) [ReportHandler.taskReport():376] finished to handle task report from backend 11001, diff task num: 0. cost: 0 ms 2023-01-12 20:59:03,564 INFO (thrift-server-pool-12|874) [ReportHandler.handleReport():179] receive report from be 11001. type: TASK, current queue size: 1 2023-01-12 20:59:05,438 INFO (thrift-server-pool-7|228) [DatabaseTransactionMgr.commitTransaction():664] transaction:[TransactionState. transaction id: 199223, label: 978ec977c304cc9-811adb1c2311872e, db id: 14994, table id list: 15867, callback id: 15909, coordinator: FE: 172.19.107.56, transaction status: COMMITTED, error replicas num: 0, replica ids: , prepare time: 1673528335222, commit time: 1673528345437, finish time: -1, reason: attactment: RLTaskTxnCommitAttachment [filteredRows=0, loadedRows=399, unselectedRows=0, receivedBytes=150595, taskExecutionTimeMs=10215, taskId=TUniqueId(hi:682555478914387145, lo:-9143755179528452306), jobId=15909, progress=KafkaProgress [partitionIdToOffset=0_55065579]]] successfully committed 2023-01-12 20:59:05,439 INFO (PUBLISH_VERSION|20) [PublishVersionDaemon.publishVersion():134] send publish tasks for transaction: 199223 2023-01-12 20:59:05,450 INFO (PUBLISH_VERSION|20) [DatabaseTransactionMgr.finishTransaction():962] finish transaction TransactionState. transaction id: 199223, label: 978ec977c304cc9-811adb1c2311872e, db id: 14994, table id list: 15867, callback id: 15909, coordinator: FE: 172.19.107.56, transaction status: VISIBLE, error replicas num: 0, replica ids: , prepare time: 1673528335222, commit time: 1673528345437, finish time: 1673528345449, reason: attactment: RLTaskTxnCommitAttachment [filteredRows=0, loadedRows=399, unselectedRows=0, receivedBytes=150595, taskExecutionTimeMs=10215, taskId=TUniqueId(hi:682555478914387145, lo:-9143755179528452306), jobId=15909, progress=KafkaProgress [partitionIdToOffset=0_55065579]] successfully 2023-01-12 20:59:06,330 INFO (Thread-139|902) [CanalUtils.printSummary():72] ----------- Batch Summary ------------------------------> | Batch Id: [33039] ,count : [3] , Mem size : [146] , Time : 2023-01-12 20:59:06 | Start : [mysql-bin.001015:49719002:1673528345000(2023-01-12 20:59:05)] | End : [mysql-bin.001015:49719173:1673528345000(2023-01-12 20:59:05)] ----------------------------------------------------------

2023-01-12 20:59:06,330 INFO (Thread-139|902) [CanalUtils.printRow():123] ----------------> binlog[mysql-bin.001015:49719131] , name[child,source_test] , eventType : INSERT , executeTime : 1673528345000(2023-01-12 20:59:05) , gtid : () , delay : 1330 ms

2023-01-12 20:59:06,330 INFO (Thread-139|902) [CanalUtils.printColumn():168] id : 2 type=int update=true name : 2 type=int update=true

2023-01-12 20:59:08,099 INFO (Thread-139|902) [CanalUtils.printSummary():72] ----------- Batch Summary ------------------------------> | Batch Id: [33040] ,count : [2] , Mem size : [106] , Time : 2023-01-12 20:59:08 | Start : [mysql-bin.001015:49727163:1673528348000(2023-01-12 20:59:08)] | End : [mysql-bin.001015:49727392:1673528348000(2023-01-12 20:59:08)] ----------------------------------------------------------

2023-01-12 20:59:09,584 INFO (Thread-68|113) [ReportHandler.taskReport():376] finished to handle task report from backend 11002, diff task num: 0. cost: 0 ms 2023-01-12 20:59:09,584 INFO (thrift-server-pool-3|156) [ReportHandler.handleReport():179] receive report from be 11002. type: TASK, current queue size: 1

What You Expected?

应该同步成功才对啊,

How to Reproduce?

No response

Anything Else?

mysql 版本: aliyun RDS canal 版本: 1.1.6

配置文件fe.conf:

# the output dir of stderr and stdout LOG_DIR = /data/doris/fe/logs

DATE = `date +%Y%m%d-%H%M%S` JAVA_OPTS="-Xmx8192m -XX:+UseMembar -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=7 -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+CMSClassUnloadingEnabled -XX:-CMSParallelRemarkEnabled -XX:CMSInitiatingOccupancyFraction=80 -XX:SoftRefLRUPolicyMSPerMB=0 -Xloggc:$DORIS_HOME/log/fe.gc.log.$DATE"

# For jdk 9+, this JAVA_OPTS will be used as default JVM options JAVA_OPTS_FOR_JDK_9="-Xmx8192m -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=7 -XX:+CMSClassUnloadingEnabled -XX:-CMSParallelRemarkEnabled -XX:CMSInitiatingOccupancyFraction=80 -XX:SoftRefLRUPolicyMSPerMB=0 -Xlog:gc*:$DORIS_HOME/log/fe.gc.log.$DATE:time"

## ## the lowercase properties are read by main program. ##

# INFO, WARN, ERROR, FATAL sys_log_level = INFO

# store metadata, must be created before start FE. # Default value is ${DORIS_HOME}/doris-meta meta_dir = /data/doris/fe/doris-meta

http_port = 8030 rpc_port = 9020 query_port = 9030 edit_log_port = 9010 mysql_service_nio_enabled = true

# Choose one if there are more than one ip except loopback address. # Note that there should at most one ip match this list. # If no ip match this rule, will choose one randomly. # use CIDR format, e.g. 10.10.10.0/24 # Default value is empty. priority_networks = xx.xx.xx.xx/24

# Advanced configurations # log_roll_size_mb = 1024 sys_log_dir = /data/doris/fe/logs # sys_log_roll_num = 10 # sys_log_verbose_modules = org.apache.doris audit_log_dir = /data/doris/fe/logs audit_log_modules = slow_query, query # audit_log_roll_num = 10 # meta_delay_toleration_second = 10 # qe_max_connection = 1024 # max_conn_per_user = 100 # qe_query_timeout_second = 300 # qe_slow_log_ms = 5000 enable_create_sync_job = true sync_commit_interval_second = 10000 enable_batch_delete_by_default=true

Are you willing to submit PR?

Code of Conduct

yjyJames commented 1 year ago

用show sync job看看同步任务是否在进行,有没有报错。如果同步任务已经pause了,说明配置有问题

tempestLXC commented 1 year ago

@yjyJames 用show sync job看看同步任务是否在进行,有没有报错。如果同步任务已经pause了,说明配置有问题

sync job 是RUNNING状态, 因为我设置了"canal.debug"="true" FE日志已经打印了接收到的binlog日志了:

---------------> binlog[mysql-bin.001015:49719131] , name[child,source_test] , eventType : INSERT , executeTime : 1673528345000(2023-01-12 20:59:05) , gtid : () , delay : 1330 ms

2023-01-12 20:59:06,330 INFO (Thread-139|902) [CanalUtils.printColumn():168] id : 2 type=int update=true name : 2 type=int update=true

2023-01-12 20:59:08,099 INFO (Thread-139|902) [CanalUtils.printSummary():72] ----------- Batch Summary ------------------------------> | Batch Id: [33040] ,count : [2] , Mem size : [106] , Time : 2023-01-12 20:59:08 | Start : [mysql-bin.001015:49727163:1673528348000(2023-01-12 20:59:08)] | End : [mysql-bin.001015:49727392:1673528348000(2023-01-12 20:59:08)] ----------------------------------------------------------

目前看 binlog数据已经成功到达Doris , 并且FE日志看 binlog 解析也正确, 但是不知道为什么表里一直没数据

tempestLXC commented 1 year ago

sync_commit_interval_second = 10000