apache / doris

Apache Doris is an easy-to-use, high performance and unified analytics database.
https://doris.apache.org
Apache License 2.0

Routine load loses data #4917

Open sunzhangbin opened 3 years ago

sunzhangbin commented 3 years ago

The routine load job is shown below. The problem: a small amount of data is occasionally lost, but resubmitting this routine load brings the missing data back.

CREATE ROUTINE LOAD xes1v1_db.ods_xes_platform_order_order_detail_bushu_4 ON ods_xes_platform_order_order_detail
COLUMNS(
    order_id, product_id, promotion_id, id, app_id, user_id, product_type,
    product_name, product_num, product_price, coupon_price, promotion_price,
    promotion_type, parent_product_id, parent_product_type, source_id, extras,
    created_time, updated_time, version, prepaid_card_price, `table`
),
WHERE `table` regexp 'orderdetail[0-9]' AND app_id = 8 AND created_time >= '2020-11-17 00:00:00'
PROPERTIES
(
    "format" = "json",
    "jsonpaths" = "[\"$.data.order_id\", \"$.data.product_id\", \"$.data.promotion_id\", \"$.data.id\", \"$.data.app_id\", \"$.data.user_id\", \"$.data.product_type\", \"$.data.product_name\", \"$.data.product_num\", \"$.data.product_price\", \"$.data.coupon_price\", \"$.data.promotion_price\", \"$.data.promotion_type\", \"$.data.parent_product_id\", \"$.data.parent_product_type\", \"$.data.source_id\", \"$.data.extras\", \"$.data.created_time\", \"$.data.updated_time\", \"$.data.version\", \"$.data.prepaid_card_price\", \"$.table\"]"
)
FROM KAFKA
(
    "kafka_broker_list" = "10.20.34.60:9092,10.20.34.62:9092,10.20.34.64:9092",
    "kafka_topic" = "xes_plarform_order_4",
    "property.group.id" = "ods_xes_platform_order_order_detail_bushu",
    "property.client.id" = "ods_xes_platform_order_order_detail_bushu",
    "property.kafka_default_offsets" = "OFFSET_BEGINNING"
);

CREATE TABLE ods_xes_platform_order_order_detail (
    order_id varchar(64) NULL DEFAULT "0" COMMENT "order ID",
    product_id int(11) NULL DEFAULT "0" COMMENT "product ID",
    promotion_id varchar(64) NULL DEFAULT "0" COMMENT "gift-with-purchase / renewal bundle rule ID",
    id varchar(64) NULL COMMENT "id",
    app_id varchar(64) NULL DEFAULT "0" COMMENT "business line ID",
    user_id int(11) NULL DEFAULT "0" COMMENT "user ID",
    product_type int(11) NULL DEFAULT "0" COMMENT "product category",
    product_name varchar(255) NULL DEFAULT "" COMMENT "product name",
    product_num int(11) NULL DEFAULT "0" COMMENT "product quantity",
    product_price int(11) NULL DEFAULT "0" COMMENT "product sale amount",
    coupon_price int(11) NULL DEFAULT "0" COMMENT "coupon share of the amount",
    promotion_price int(11) NULL DEFAULT "0" COMMENT "promotion share of the amount",
    promotion_type int(11) NULL DEFAULT "0" COMMENT "promotion type",
    parent_product_id int(11) NULL DEFAULT "0" COMMENT "parent product ID",
    parent_product_type int(11) NULL DEFAULT "0" COMMENT "parent product category, business lines may define their own",
    source_id varchar(30) NULL DEFAULT "" COMMENT "hotspot data",
    extras varchar(3072) NULL DEFAULT "" COMMENT "auxiliary order-item info, e.g. immutable promotion details; never queried or searched",
    created_time varchar(64) NULL DEFAULT "0000-00-00 00:00:00" COMMENT "creation time",
    updated_time varchar(64) NULL DEFAULT "1970-00-00 00:00:00" COMMENT "modification time",
    version varchar(64) NULL DEFAULT "" COMMENT "version control",
    prepaid_card_price int(11) NULL DEFAULT "0" COMMENT "gift card amount",
    `table` varchar(64) NULL DEFAULT "" COMMENT "source table"
) ENGINE=OLAP
UNIQUE KEY(order_id, product_id, promotion_id)
COMMENT "online-school order item table"
DISTRIBUTED BY HASH(order_id) BUCKETS 10
PROPERTIES (
    "replication_num" = "3",
    "in_memory" = "false",
    "storage_format" = "V2"
);
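Since resubmitting the job recovers the rows, the job's own accounting is worth checking first: rows filtered out by the WHERE clause above are counted as unselected rows, separately from error rows, so the statistics can distinguish "never consumed" from "consumed but filtered". A quick check against the job above (syntax per the SHOW ROUTINE LOAD documentation; the exact output fields may vary by version):

-- Job-level view: per-partition Kafka progress plus row counters
-- (loaded / error / unselected rows) for the whole job.
SHOW ROUTINE LOAD FOR xes1v1_db.ods_xes_platform_order_order_detail_bushu_4;

-- Per-task view of the same job's currently running subtasks.
SHOW ROUTINE LOAD TASK WHERE JobName = "ods_xes_platform_order_order_detail_bushu_4";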

stalary commented 3 years ago

+1, I've run into this problem too. I have two clusters, one offline and one online, and their data keeps coming out inconsistent; rerunning the routine load fixes it.

sunzhangbin commented 3 years ago

> +1, I've run into this problem too. I have two clusters, one offline and one online, and their data keeps coming out inconsistent; rerunning the routine load fixes it.

Right now I run two routine loads syncing the same data in parallel, which so far avoids losing rows, but that doesn't address the root cause.

stalary commented 3 years ago

@sunzhangbin Running two at the same time could introduce ordering problems, though.

stalary commented 3 years ago

Could this be related to the enable.auto.commit setting in librdkafka? I see it defaults to true.
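To rule that out, any "property."-prefixed setting is passed through to the Kafka client, the same mechanism the job above already uses for property.group.id. Whether this particular override changes anything is an open question, since routine load tracks its consumption progress in Doris's own FE metadata rather than relying on Kafka's committed offsets. A sketch, assuming the key is accepted as a custom property (the job must be paused before altering and resumed afterwards):

ALTER ROUTINE LOAD FOR xes1v1_db.ods_xes_platform_order_order_detail_bushu_4
FROM KAFKA
(
    -- Assumption: forwarded to librdkafka like the other property.* settings.
    "property.enable.auto.commit" = "false"
);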

sunzhangbin commented 3 years ago

With 4 routine load jobs syncing into the same table at once, data is occasionally lost; after switching to a single routine load job per table I haven't seen any more loss. I can't find the specific cause, and the logs show nothing abnormal.
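If parallelism really is the variable, it can also be narrowed down within one job by capping its task concurrency; desired_concurrent_number is the documented knob for that. A sketch of the usual pause/alter/resume cycle:

PAUSE ROUTINE LOAD FOR xes1v1_db.ods_xes_platform_order_order_detail_bushu_4;

-- Run the job with a single consumer task instead of several in parallel.
ALTER ROUTINE LOAD FOR xes1v1_db.ods_xes_platform_order_order_detail_bushu_4
PROPERTIES ("desired_concurrent_number" = "1");

RESUME ROUTINE LOAD FOR xes1v1_db.ods_xes_platform_order_order_detail_bushu_4;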

EmmyMiao87 commented 3 years ago

If you are using the Unique model, the data may have been replaced because ordering is not guaranteed. When a routine load runs with concurrency, the execution order of its tasks is undefined, which means data at earlier Kafka offsets is not necessarily loaded first. And because it is a Unique model, data loaded successfully later overwrites earlier data, so you get the illusion of data loss.
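To make the replace semantics concrete, a minimal sketch on a hypothetical scratch table (demo_uniq is made up for illustration):

-- On a UNIQUE KEY table, the last successfully loaded row for a key wins.
CREATE TABLE demo_uniq (
    k int NULL,
    v varchar(32) NULL
) ENGINE=OLAP
UNIQUE KEY(k)
DISTRIBUTED BY HASH(k) BUCKETS 1
PROPERTIES ("replication_num" = "1");

INSERT INTO demo_uniq VALUES (1, 'earlier offset');
INSERT INTO demo_uniq VALUES (1, 'later offset');

-- Returns one row: (1, 'later offset'); the first row is no longer visible.
SELECT * FROM demo_uniq;

If two concurrent load tasks replay the same two writes in the opposite order, the 'earlier offset' row is what survives, which looks like lost data when compared against the source.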

stalary commented 3 years ago

> If you are using the Unique model, the data may have been replaced because ordering is not guaranteed. When a routine load runs with concurrency, the execution order of its tasks is undefined, which means data at earlier Kafka offsets is not necessarily loaded first. And because it is a Unique model, data loaded successfully later overwrites earlier data, so you get the illusion of data loss.

Can a single routine load guarantee per-partition ordering?
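As far as the Kafka consumer goes, a single task reads each of its partitions in offset order, but cross-partition and cross-task order is still undefined. Rather than relying on scheduling, newer Doris versions let a Unique-model table declare a sequence column, so that for a given key the row with the larger sequence value wins regardless of arrival order. A sketch, assuming a version with sequence-column support; the mapped column must be an integer or date/time type, so the varchar updated_time in the table above would not qualify as-is:

CREATE TABLE demo_uniq_seq (
    k int NULL,
    v varchar(32) NULL,
    event_time datetime NULL
) ENGINE=OLAP
UNIQUE KEY(k)
DISTRIBUTED BY HASH(k) BUCKETS 1
PROPERTIES (
    "replication_num" = "1",
    -- Assumption: marks event_time as the sequence column; for a given key,
    -- the row with the larger event_time wins even if it arrives first.
    "function_column.sequence_col" = "event_time"
);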

sunzhangbin commented 3 years ago

> If you are using the Unique model, the data may have been replaced because ordering is not guaranteed. [...]
>
> Can a single routine load guarantee per-partition ordering?

I'm using the Unique model, but I've confirmed it is not a replace: the lost rows are entirely absent from the table.

MARTIN3242017 commented 1 year ago

Is there any follow-up on this? I'm running into it too.

fengyang2333 commented 9 months ago

> Is there any follow-up on this? I'm running into it too.

What does your scenario look like? Is there any pattern to when it happens?

q1051738725 commented 8 months ago

@EmmyMiao87 Doris 2.0.2 also hits this problem; can we find the root cause and fix it?