pingcap / tidb

TiDB - the open-source, cloud-native, distributed SQL database designed for modern applications.
https://pingcap.com
Apache License 2.0
37.28k stars 5.84k forks source link

load data commit forced to quit #55204

Open 15767637775 opened 3 months ago

15767637775 commented 3 months ago

Bug Report

1. Minimal reproduce step (Required)

this sql has 2 identical column named 'group_type':

LOAD DATA LOCAL INFILE '/tmp/sql.csv' 
    IGNORE INTO TABLE t_related_group_account 
    FIELDS TERMINATED BY '\t' 
    LINES TERMINATED BY '\n'
    (group_id,group_key,group_type,account_id,account_type,
    market_id,customer_id,one_pass_account,customer_name,
    open_date,certificate_type,certificate_number,branch_id,
    customer_contact,broker_name,broker_contact,branch_name,
    accountsJson,data_status,business_type,permission_id,
    group_type,add_time,add_user,update_time,update_user);

ERROR 1105 (HY000): commit forced to quit

The attachment is here: sql.csv

table schema:

CREATE TABLE `t_related_group_account` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT COMMENT '自增id',
  `group_id` int(11) NOT NULL COMMENT '监控名单组Id',
  `group_key` varchar(250) COLLATE utf8_general_ci DEFAULT NULL COMMENT '多账户客户的时候使用的key',
  `account_id` varchar(150) COLLATE utf8_general_ci DEFAULT NULL COMMENT '账户账户',
  `account_type` int(11) DEFAULT '0' COMMENT '账户类型,详见AccountType',
  `one_pass_account` varchar(50) COLLATE utf8_general_ci DEFAULT NULL COMMENT '一码通账户',
  `market_id` smallint(6) DEFAULT NULL COMMENT '交易所:101沪市,102深市',
  `customer_id` varchar(50) COLLATE utf8_general_ci DEFAULT NULL COMMENT '客户编码',
  `customer_name` varchar(250) COLLATE utf8_general_ci DEFAULT NULL COMMENT '客户名称',
  `open_date` date DEFAULT NULL COMMENT '开户日期',
  `certificate_type` tinyint(1) DEFAULT NULL COMMENT '证件类型:0居民身份证,1军官证,2港澳台通行证,4护照,5其他证件',
  `certificate_number` varchar(50) COLLATE utf8_general_ci DEFAULT NULL COMMENT '证件号码',
  `branch_id` varchar(50) COLLATE utf8_general_ci DEFAULT NULL COMMENT '营业部编码',
  `customer_contact` varchar(50) COLLATE utf8_general_ci DEFAULT NULL COMMENT '客户联系方式',
  `broker_name` varchar(250) COLLATE utf8_general_ci DEFAULT NULL COMMENT '客户经理名称',
  `broker_contact` varchar(50) COLLATE utf8_general_ci DEFAULT NULL COMMENT '客户经理联系方式',
  `branch_name` varchar(250) COLLATE utf8_general_ci DEFAULT NULL COMMENT '营业部名称',
  `accountsJson` json DEFAULT NULL COMMENT '账户信息',
  `data_status` int(1) DEFAULT '0' COMMENT '数据状态:-2:未更新,-1:删除,0:新增,1:已更新',
  `business_type` int(11) DEFAULT '1' COMMENT '账户类型,具体枚举类型见t_dict表的BusinessType枚举',
  `permission_id` varchar(50) COLLATE utf8_general_ci DEFAULT NULL COMMENT '权限Id',
  `group_type` int(1) DEFAULT NULL COMMENT '分类:0多账户客户,1关联账户组',
  `add_time` datetime DEFAULT CURRENT_TIMESTAMP COMMENT '新增时间',
  `add_user` varchar(50) COLLATE utf8_bin DEFAULT NULL COMMENT '新增人',
  `update_time` datetime DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT '修改时间',
  `update_user` varchar(50) COLLATE utf8_bin DEFAULT NULL COMMENT '修改人',
  PRIMARY KEY (`id`) /*T![clustered_index] CLUSTERED */
) ;

tidb.log:

[2024/08/05 10:51:36.031 +08:00] [Error] [load_data.go:392] ["commit forced to quit, possible preparation failed"] [conn=5344086106851770779]
[2024/08/05 10:51:36.033 +08:00] [Info] [conn.go:1787] ["draining finished for error"] [conn=5344086106851770779] [error="commit forced to quit"]
[2024/08/05 10:51:36.033 +08:00] [Info] [conn.go:1197] ["command dispatched failed"] [conn=5344086106851770779] [connInfo="id:5344086106851770779, addr:10.129.1.200:52660 status:0, collation:utf8_general_ci, user:archforce"] [command=Query] [status="inTxn:0, autocommit:0"] [sql="LOAD DATA LOCAL INFILE 'sql.csv' IGNORE INTO TABLE t_related_group_account FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n'(group_id,group_key,group_type,account_id,account_type,market_id,customer_id,one_pass_account,customer_name,open_date,certificate_type,certificate_number,branch_id,customer_contact,broker_name,broker_contact,branch_name,accountsJson,data_status,business_type,permission_id,group_type,add_time,add_user,update_time,update_user)"] [txn_mode=PESSIMISTIC] [timestamp=451628577129758869] [err="commit forced to quit\ngithub.com/pingcap/tidb/executor.(*LoadDataInfo).CommitWork\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/executor/load_data.go:391\ngithub.com/pingcap/tidb/server.(*clientConn).handleLoadData\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/server/conn.go:1767\ngithub.com/pingcap/tidb/server.(*clientConn).handleQuerySpecial\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/server/conn.go:2217\ngithub.com/pingcap/tidb/server.(*clientConn).handleStmt\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/server/conn.go:2196\ngithub.com/pingcap/tidb/server.(*clientConn).handleQuery\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/server/conn.go:1988\ngithub.com/pingcap/tidb/server.(*clientConn).dispatch\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/server/conn.go:1419\ngithub.com/pingcap/tidb/server.(*clientConn).Run\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/server/conn.go:1168\ngithub.com/pingcap/tidb/server.(*Server).onConn\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/server/server.go:692\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_arm64.s:1172"]

2. What did you expect to see? (Required)

This should not cause an error, the data will be successfully imported, just like in MySQL 5.7. tidb v6.5.9 sql_mode is the same as mysql :

ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION

3. What did you see instead (Required)

I saw the error: ERROR 1105 (HY000): commit forced to quit.

The error should be "Column 'group_type' specified twice", not "commit forced to quit". There might be an issue somewhere in the process.

4. What is your TiDB version? (Required)

v6.5.9

lance6716 commented 3 months ago

Thanks for reporting. I have checked in v7.1.5 the error message is correct.

mysql> LOAD DATA LOCAL INFILE '/tmp/sql.csv'
    ->     IGNORE INTO TABLE t_related_group_account
    ->     FIELDS TERMINATED BY '\t'
    ->     LINES TERMINATED BY '\n'
    ->     (group_id,group_key,group_type,account_id,account_type,
    ->     market_id,customer_id,one_pass_account,customer_name,
    ->     open_date,certificate_type,certificate_number,branch_id,
    ->     customer_contact,broker_name,broker_contact,branch_name,
    ->     accountsJson,data_status,business_type,permission_id,
    ->     group_type,add_time,add_user,update_time,update_user);
ERROR 1110 (42000): Column 'group_type' specified twice

I'll check MySQL behaviour later.

lance6716 commented 3 months ago

As a workaround, you can change the duplicated column to @dummy, like

LOAD DATA LOCAL INFILE '/tmp/sql.csv' 
    IGNORE INTO TABLE t_related_group_account 
    FIELDS TERMINATED BY '\t' 
    LINES TERMINATED BY '\n'
    (group_id,group_key,group_type,account_id,account_type,
    market_id,customer_id,one_pass_account,customer_name,
    open_date,certificate_type,certificate_number,branch_id,
    customer_contact,broker_name,broker_contact,branch_name,
    accountsJson,data_status,business_type,permission_id,
    @dummy,add_time,add_user,update_time,update_user);