liuxinwang / go-mysql-starrocks

mysql to starrocks|doris sync
GNU General Public License v2.0
24 stars 7 forks source link

[error] api.go:464 handling execute select streaming failed. err: io.CopyN failed. err unexpected EOF, copied 0, expected 615: connection was bad #20

Closed running-db closed 9 months ago

running-db commented 9 months ago

v0.5.2同步一个大表,从mysql8.0.22到starrocks 3.2.2, 大约400多万数据,老是报以下的一个error错误。有时候几万条就报,最后这一次运行了一个多小时,同步了300多万数据也报错了。

程序报错日志: [2024/01/31 23:56:04] [info] api.go:81 add schema table meta data: ces.ces_delivery_goods [2024/01/31 23:56:04] [info] api.go:95 add rule map: {"RuleType":"dynamic add","full_sync":true,"source-schema":"db01","source-table":"t_delivery_goods","target-schema":"p01","target-table":"t_delivery_goods"} [2024/01/31 23:56:04] [info] api.go:108 add rule includeTableRegex: ^db01.t_delivery_goods$ [2024/01/31 23:56:04] [info] api.go:109 add rule successfully [2024/01/31 23:56:04] [info] api.go:114 pause output write [2024/01/31 23:56:04] [info] api.go:320 start handle full data sync... [2024/02/01 01:07:27] [error] api.go:464 handling execute select streaming failed. err: io.CopyN failed. err unexpected EOF, copied 0, expected 615: connection was bad [2024/02/01 01:07:27] [info] api.go:144 resume output write

curl添加全量同步那的报错: result: add rule full sync handle failed err: io.CopyN failed. err unexpected EOF, copied 0, expected 615: connection was bad, full sync rows: 3414723

liuxinwang commented 9 months ago

我测试下,实时同步是否正常?

running-db commented 9 months ago

实时同步正常,只是没有已存在的数据,是新增的才有。我还有一个100多万数据的表都同步成功了. mysql那边与连接timeout相关参数: | interactive_timeout | 28800 | | wait_timeout | 28800 |

running-db commented 9 months ago

还存在个问题,ctrl+c 终止程序有时候终止不了 image

liuxinwang commented 9 months ago

net_write_timeout 这个参数看一下,是不是mysql读取太慢了,又或者sr写入太慢?正常400w应该半个小时内可以同步完,mysql表结构提供下

running-db commented 9 months ago

| net_write_timeout | 60 |

Create Table: CREATE TABLE `t_delivery_goods` (
  `dg_id` bigint NOT NULL AUTO_INCREMENT,
  `order_id` varchar(32) NOT NULL,
  `delivery_id` bigint NOT NULL,
  `mer_no` varchar(32) NOT NULL,
  `pmer_id` bigint DEFAULT NULL,
  `spl_no` varchar(32) NOT NULL,
  `mer_order_no` varchar(32) NOT NULL,
  `sku_id` varchar(32) NOT NULL ,
  `buy_num` int NOT NULL COMMENT ,
  `mer_goods_amount` decimal(20,5) NOT NULL DEFAULT '0.00000',
  `goods_cost_amount` decimal(20,5) NOT NULL DEFAULT '0.00000' ,
  `mer_sku_no` varchar(32) DEFAULT NULL ,
  `mer_sub_order_no` varchar(32) DEFAULT NULL ,
  `spl_sup_order_no` varchar(64) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL ,
  `delivery_status` int NOT NULL DEFAULT '10' ,
  `has_refunded` int NOT NULL DEFAULT '1' ,
  `create_time` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP ,
  `delivery_time` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP ,
  `finish_time` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP ,
  `notify_url` varchar(32) DEFAULT NULL ,
  `notify_status` tinyint NOT NULL DEFAULT '10' ,
  `max_count` tinyint NOT NULL DEFAULT '10' ,
  `notify_count` tinyint NOT NULL DEFAULT '0' ,
  `start_time` datetime DEFAULT NULL ,
  `end_time` datetime DEFAULT NULL ,
  `notify_msg` varchar(256) DEFAULT NULL ,
  `refund_id` bigint NOT NULL DEFAULT '0' ,
  `mer_product_id` bigint NOT NULL DEFAULT '0' ,
  `zone_id` int NOT NULL DEFAULT '0' ,
  `spl_shelf_id` int NOT NULL,
  `spl_spu_id` bigint NOT NULL DEFAULT '0' ,
  `spu_id` varchar(32) NOT NULL DEFAULT '-' ,
  `spl_sku_id` bigint NOT NULL DEFAULT '0' ,
  `spl_delivery_id` bigint NOT NULL DEFAULT '0',
  `face` decimal(20,5) NOT NULL ,
  `total_face` decimal(20,5) NOT NULL ,
  `mer_goods_price` decimal(20,5) NOT NULL DEFAULT '0.00000' ,
  `mer_pay_amount` decimal(20,5) NOT NULL DEFAULT '0.00000' ,
  `mer_deduct_amount` decimal(20,5) NOT NULL DEFAULT '0.00000' ,
  `goods_cost_price` decimal(20,5) NOT NULL DEFAULT '0.00000' ,
  `refund_num` bigint NOT NULL DEFAULT '0' ,
  `goods_refund_face` decimal(20,5) NOT NULL DEFAULT '0.00000' ,
  `goods_refund_amount` decimal(20,5) NOT NULL DEFAULT '0.00000' ,
  `pay_refund_amount` decimal(20,5) NOT NULL DEFAULT '0.00000' ,
  `deduct_refund_amount` decimal(20,5) NOT NULL DEFAULT '0.00000' ,
  `refund_cost_amount` decimal(20,5) NOT NULL DEFAULT '0.00000',
  `tax_rate` decimal(20,5) NOT NULL DEFAULT '0.00000' ,
  `aftersales_dealing_num` bigint NOT NULL DEFAULT '0' ,
  `aftersales_success_num` bigint NOT NULL DEFAULT '0' ,
  `aftersales_refund_num` bigint NOT NULL DEFAULT '0' ,
  `delivery_type` int DEFAULT '1' ,
  `need_offset` int NOT NULL DEFAULT '1' ,
  `corp_settle_amount` decimal(20,5) NOT NULL DEFAULT '0.00000' ,
  `corp_settle_status` int NOT NULL DEFAULT '10',
  `corp_id` int DEFAULT NULL ,
  `corp_settle_num` bigint NOT NULL DEFAULT '0' ,
  `corp_verify_status` int NOT NULL DEFAULT '10' ,
  `corp_settle_differ` decimal(20,5) NOT NULL DEFAULT '0.00000' ,
  `plain_cost_price` decimal(20,5) DEFAULT NULL,
  `spl_verify_status` int NOT NULL DEFAULT '10',
  `spl_settle_status` int NOT NULL DEFAULT '10' ,
  `spl_need_offset` int NOT NULL DEFAULT '1' ,
  `spl_settle_amount` decimal(20,5) NOT NULL DEFAULT '0.00000' ,
  `spl_settle_differ_amount` decimal(20,5) NOT NULL DEFAULT '0.00000',
  `spl_settle_num` bigint NOT NULL DEFAULT '0' ,
  `spl_settle_differ_num` bigint NOT NULL DEFAULT '0' ,
  `deduct_title` varchar(512) DEFAULT NULL,
  `deduct_ic_id` bigint DEFAULT NULL ,
  `scorp_id` int DEFAULT NULL ,
  `logistics_amount` decimal(20,5) NOT NULL DEFAULT '0.00000',
  `deduct_logistic_amount` decimal(20,5) DEFAULT '0.00000',
  `pay_logistic_amount` decimal(20,5) DEFAULT '0.00000' ,
  `refund_logistic_amount` decimal(20,5) DEFAULT '0.00000' ,
  `refund_pay_logistic_amount` bigint NOT NULL DEFAULT '0' ,
  `refund_deduct_logistic_amount` bigint NOT NULL DEFAULT '0' ,
  `remark` text ,
  `spl2_spu_id` varchar(32) DEFAULT NULL ,
  `spl2_sku_id` varchar(32) DEFAULT NULL ,
  `spl2_sale_price` decimal(10,3) DEFAULT NULL ,
  `spl2_name` varchar(32) DEFAULT NULL ,
  `spl2_order_no` varchar(32) DEFAULT NULL ,
  `spl2_sku_order_no` varchar(32) DEFAULT NULL ,
  `spl2_no` varchar(32) DEFAULT NULL ,
  `voucher_amount` decimal(20,5) DEFAULT '0.00000' ,
  `bearer` int DEFAULT NULL,
  `other_sku_id` int DEFAULT NULL ,
  PRIMARY KEY (`dg_id`)
) ENGINE=InnoDB AUTO_INCREMENT=222102105846758 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
liuxinwang commented 9 months ago

我本地docker环境测试是没有问题的:

mysql> select version();
+-----------+
| version() |
+-----------+
| 8.0.22    |
+-----------+
1 row in set (0.01 sec)

mysql> select current_version();
+-------------------+
| current_version() |
+-------------------+
| 3.2.2-269e832     |
+-------------------+
1 row in set (0.04 sec)

go-mysql-sr log: -- 500w, 用时 3min
2024-02-01 13:55:11 [2024/02/01 13:55:11] [info] api.go:81 add schema table meta data: sysbenchts.sbtest1
2024-02-01 13:55:11 [2024/02/01 13:55:11] [info] api.go:95 add rule map: {"RuleType":"dynamic add","full_sync":true,"source-schema":"sysbenchts","source-table":"sbtest1","target-schema":"sysbenchts","target-table":"sbtest1"}
2024-02-01 13:55:11 [2024/02/01 13:55:11] [info] api.go:108 add rule includeTableRegex: ^sysbenchts\.sbtest1$
2024-02-01 13:55:11 [2024/02/01 13:55:11] [info] api.go:109 add rule successfully
2024-02-01 13:55:11 [2024/02/01 13:55:11] [info] api.go:114 pause output write
2024-02-01 13:55:11 [2024/02/01 13:55:11] [info] api.go:320 start handle full data sync...
2024-02-01 13:58:08 [2024/02/01 13:58:08] [info] api.go:478 full data sync total rows: 5000000
2024-02-01 13:58:08 [2024/02/01 13:58:08] [info] api.go:483 close conn
2024-02-01 13:58:08 [2024/02/01 13:58:08] [info] api.go:485 end handle full data sync
2024-02-01 13:58:08 [2024/02/01 13:58:08] [info] api.go:158 resume output write

检查下你的网络环境或者服务器性能,或者换个环境再试试看

running-db commented 9 months ago

又重新同步了一次,观察了下10秒左右1万条。之前还跨机房同步过上千万的数据也没遇到这个报错,而且之前还是hdd的磁盘,这次数据库和starrocks都在ssd上,且是内网,能通过什么参数规避这个不

running-db commented 9 months ago

另外咨询个问题,这个拉取binlog不需要配置server_id的么?

liuxinwang commented 9 months ago
  1. 正常应该每秒能到几万这样,有做grafana+prometheus监控吗?可以通过监控看看是读取慢 还是 写入慢
  2. server_id是自动生成的,范围在1000 - 2000之间
running-db commented 9 months ago

server_id能否设置可配置,怕跟内部复制的server_id冲突了。程序没做监控,mysql和sr那边io和cpu都很低。这一次已经同步到400多万了,有点希望会同步完。

liuxinwang commented 9 months ago
  1. 通过你的表结构测试是比较慢,qps大概在1500左右,可能是跟表字段较多有关系,我看看怎么优化下
  2. server_id可配置后面增强支持下
running-db commented 9 months ago

感谢

running-db commented 9 months ago

我把导入程序监控配置上了,输出如下:为啥读取延迟上小时了?是因为我pause了的原因么?我在全量导入其他表,就把整体给pause了。另外延迟消息数是负的,这个没看懂。是不是这些只统计在线同步的?全量同步的数据没有写入这些metric image

running-db commented 9 months ago

还发现一个疑问,连接sr的会话,一直是sleep状态,持续3个小时了,现在是开启了同步复制的。我看sr参数,如果sleep状态超过8小时,会被断开。写入数据,不是通过这个会话的么 image

liuxinwang commented 9 months ago
  1. metric 负值优化后修改
  2. sr的会话sleep状态问题,sr写入是通过http stream load操作,所以不用担心。
liuxinwang commented 9 months ago

参考v0.5.3版本