apache / doris

Apache Doris is an easy-to-use, high performance and unified analytics database.
https://doris.apache.org
Apache License 2.0

[Bug] register txn replica failed #15694

Closed zhouhoo closed 1 year ago

zhouhoo commented 1 year ago

Search before asking

Version

1.2.1

What's Wrong?

When using routine load to consume Kafka data, the error below occurs repeatedly:

2023-01-07 13:36:14,545 ERROR (Routine load task scheduler|36) [OlapTableSink.createLocation():397] register txn replica failed, txnId=-1, dbId=620560
2023-01-07 13:36:45,526 ERROR (Routine load task scheduler|36) [OlapTableSink.createLocation():397] register txn replica failed, txnId=-1, dbId=620560
2023-01-07 13:37:15,636 ERROR (Routine load task scheduler|36) [OlapTableSink.createLocation():397] register txn replica failed, txnId=-1, dbId=620560
2023-01-07 13:37:45,895 ERROR (Routine load task scheduler|36) [OlapTableSink.createLocation():397] register txn replica failed, txnId=-1, dbId=620560
2023-01-07 13:38:15,982 ERROR (Routine load task scheduler|36) [OlapTableSink.createLocation():397] register txn replica failed, txnId=-1, dbId=620560
2023-01-07 13:39:26,082 ERROR (Routine load task scheduler|36) [OlapTableSink.createLocation():397] register txn replica failed, txnId=-1, dbId=620560
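
As a side note, the dbId in the log can be mapped back to a database name from an FE client. A minimal sketch (620560 is the dbId reported above; the lookup in the output is done by eye):

SHOW PROC "/dbs";   -- lists all databases with their DbId; find 620560 in the output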

What You Expected?

No such errors should be logged while the routine load job runs.

How to Reproduce?

Create a routine load job and run it. The CREATE statement is as follows:

-- Create the job
CREATE ROUTINE LOAD kafka_gxy_student ON gxy_students
-- load_properties: load description
COLUMNS TERMINATED BY ",",   -- column separator
COLUMNS(
    student_id, school_id, create_time, is_deleted, snow_flake_id, modified_time, user_id, school_name,
    academe_id, academe_name, major_id, major_name, major_field, classes_id, classes_name, username,
    student_number, gender, grade, bind_state, bind_time, mobile, family_name, family_mobile,
    family_address, family_province, family_city, family_area, birthday, nation, origin_province,
    origin_city, origin_area, card_no, age, is_employment, email, level, educational, is_practice,
    about_type, school_add_time, graduation_time, nationality, face, overseas, weixin, qq, bank_name,
    bank_account, practice_state, job_state, head_img, auth_code, backup, create_by, modified_by
)
PROPERTIES -- common parameters of the routine load job
(
    "desired_concurrent_number" = "6",   -- job concurrency
    "strict_mode" = "false",             -- strict mode
    "format" = "json",                   -- format: json or csv
    -- The following parameters bound a single task; the task ends when any one of the thresholds is reached.
    -- Assuming ~500 B per row and one task per 100 MB or 10 seconds: 100 MB takes roughly 10-20 seconds
    -- to process, which corresponds to about 200000 rows.
    "max_batch_interval" = "10",         -- maximum execution time of each subtask
    "max_batch_rows" = "200000",         -- maximum number of rows read from Kafka per task
    "max_batch_size" = "104857600",      -- maximum amount of data read from Kafka per task, in bytes
    "max_error_number" = "1000",         -- allowed number of error rows
    "json_root" = "$.data",              -- take the records under $.data
    "strip_outer_array" = "true"         -- $.data is a JSON array [{}], so flatten it
)
FROM KAFKA
( -- Kafka info
    "kafka_broker_list" = "10.xx.xx.xx:9092",  -- brokers
    "kafka_topic" = "gxy_student",             -- topic name
    "kafka_partitions" = "0",
    "kafka_offsets" = "OFFSET_END"
);
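
To check whether the job keeps consuming while the error is being logged, the standard routine load status statements can be used. A minimal sketch, assuming the job name above (output columns vary slightly between versions):

SHOW ROUTINE LOAD FOR kafka_gxy_student;                      -- job state and load statistics (loaded/error rows)
SHOW ROUTINE LOAD TASK WHERE JobName = "kafka_gxy_student";   -- state of the tasks currently scheduled on the BEs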

The data from Kafka is imported, but the error keeps being logged.

Anything Else?

none

Are you willing to submit PR?

Code of Conduct

tempestLXC commented 1 year ago

I ran into the same problem. At first I thought it only appeared during binlog synchronization, but now I see this error is reported continuously.