matrixorigin / matrixone

Hyperconverged cloud-edge native database
https://docs.matrixorigin.cn/en
Apache License 2.0
1.77k stars 275 forks source link

[Bug]: run tpcc test and then restore report Duplicate entry #16100

Closed YANGGMM closed 4 months ago

YANGGMM commented 5 months ago

Is there an existing issue for the same bug?

Branch Name

main

Commit ID

newest

Other Environment Information

- Hardware parameters:
- OS type:
- Others:

Actual Behavior


mysql> restore account sys from snapshot sp_01;
ERROR 1062 (HY000): Duplicate entry '1' for key 'hist_id'

mysql> restore account sys database tpcc from snapshot sp_01;
ERROR 1062 (HY000): Duplicate entry '8193' for key 'hist_id'

Expected Behavior

No response

Steps to Reproduce

1.create snapshot 

2.run tpcc test

3.restore

Additional information

No response

YANGGMM commented 5 months ago

尝试复现

mysql> show create table bmsql_history ;
+---------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table         | Create Table                                                                                                                                                                                                                                                                                                                                         |
+---------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| bmsql_history | CREATE TABLE `bmsql_history` (
`hist_id` INT NOT NULL AUTO_INCREMENT,
`h_c_id` INT DEFAULT NULL,
`h_c_d_id` INT DEFAULT NULL,
`h_c_w_id` INT DEFAULT NULL,
`h_d_id` INT DEFAULT NULL,
`h_w_id` INT DEFAULT NULL,
`h_date` TIMESTAMP DEFAULT NULL,
`h_amount` DECIMAL(6,2) DEFAULT NULL,
`h_data` VARCHAR(24) DEFAULT NULL,
PRIMARY KEY (`hist_id`)
) |
+---------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> create table `bmsql_history_restore`(`hist_id` INT NOT NULL AUTO_INCREMENT,`h_c_id` INT DEFAULT NULL,`h_c_d_id` INT DEFAULT NULL,`h_c_w_id` INT DEFAULT NULL,`h_d_id` INT DEFAULT NULL,`h_w_id` INT DEFAULT NULL,`h_date` TIMESTAMP DEFAULT NULL,`h_amount` DECIMAL(6,2) DEFAULT NULL,`h_data` VARCHAR(24) DEFAULT NULL,PRIMARY KEY (`hist_id`));
Query OK, 0 rows affected (0.07 sec)

mysql> insert into `bmsql_history_restore` select * from `bmsql_history` {snapshot = 'sp_01'};
Query OK, 300000 rows affected (3.27 sec)

恢复成功,而且数据量正确

YANGGMM commented 5 months ago

用restore 报错

mysql> restore account sys database tpcc table `bmsql_history` from snapshot sp_01;
ERROR 1062 (HY000): Duplicate entry '8193' for key 'hist_id'
YANGGMM commented 5 months ago
image
YANGGMM commented 5 months ago

测试数据量最大的表restore,正常恢复

mysql> restore account sys database tpcc table bmsql_order_line from snapshot sp_01;
Query OK, 0 rows affected (12 min 57.49 sec)

恢复后数据一切正常

mysql> select count(*) from bmsql_order_line;
+----------+
| count(*) |
+----------+
|  2998810 |
+----------+
1 row in set (0.04 sec)

mysql> select count(*) from bmsql_order_line{snapshot = 'sp_01'};
+----------+
| count(*) |
+----------+
|  2998810 |
+----------+
1 row in set (0.04 sec)
YANGGMM commented 5 months ago

mysql>  restore account sys database tpcc table `bmsql_history` from snapshot sp_01;
ERROR 1062 (HY000): Duplicate entry '16385' for key 'hist_id'
mysql> 
{"level":"ERROR","time":"2024/05/15 11:08:30.914112 +0800","caller":"fuzzyfilter/filter.go:275","msg":"error: Duplicate entry '16385' for key 'hist_id'","span":{"trace_id":"3df833af-1e84-c416-e634-d8f9cc4a1ed0","span_id":"d03e90d309a9ddaf"}}
YANGGMM commented 5 months ago
cerate database tpcc;
use tpcc;
create table `bmsql_history` (`hist_id` INT NOT NULL AUTO_INCREMENT,`h_c_id` INT DEFAULT NULL,`h_c_d_id` INT DEFAULT NULL,`h_c_w_id` INT DEFAULT NULL,`h_d_id` INT DEFAULT NULL,`h_w_id` INT DEFAULT NULL,`h_date` TIMESTAMP DEFAULT NULL,`h_amount` DECIMAL(6,2) DEFAULT NULL,`h_data` VARCHAR(24) DEFAULT NULL,PRIMARY KEY (`hist_id`));

--load file tpcccust-hist.csv


create snapshot sp_01 for account sys;

 restore account sys database tpcc table `bmsql_history` from snapshot sp_01;
jensenojs commented 5 months ago
mysql>  restore account sys database tpcc table `bmsql_history` from snapshot sp_01;
ERROR 1062 (HY000): Duplicate entry '16385' for key 'hist_id'
mysql> 
{"level":"ERROR","time":"2024/05/15 11:08:30.914112 +0800","caller":"fuzzyfilter/filter.go:275","msg":"error: Duplicate entry '16385' for key 'hist_id'","span":{"trace_id":"3df833af-1e84-c416-e634-d8f9cc4a1ed0","span_id":"d03e90d309a9ddaf"}}
  1. 和钊哥一起看了一下, fuzzy 的table scan不应该读取上来数据, 但读上来了.

  2. 在后台SQL的执行过程中用select count(*) from tblName试了一下, 没有读取上来数据

Ariznawlll commented 5 months ago

情景一: 单机测试了sysbench 点查 commit:16a954e8eaa0c37f36c1fd6cc01691511758b69f 数据量:10w 测试步骤1:

sys租户连接mo:
create database sysbench_db;

本地:git clone https://github.com/matrixorigin/mo-sysbench.git

cd mo-sysbench
sysbench  --mysql-host=127.0.0.1 --mysql-port=6001 --mysql-user=dump --mysql-password=111 oltp_point_select.lua --mysql-db=sysbench_db --tables=10 --table_size=100000 --threads=100 --time=30 --report-interval=10    cleanup

建db并插入数据
sysbench  --mysql-host=127.0.0.1 --mysql-port=6001 --mysql-user=dump --mysql-password=111 oltp_point_select.lua --mysql-db=sysbench_db --tables=10 --table_size=100000 --threads=100 --time=30 --report-interval=10    prepare

sys租户连接mo:
create snapshot sp01 for account sys;

执行点查测试
cd mo-sysbench
sysbench  --mysql-host=127.0.0.1 --mysql-port=6001 --mysql-user=dump --mysql-password=111   oltp_point_select.lua --mysql-db=sysbench_db --tables=10 --table_size=100000 --threads=100 --time=300 --report-interval=10    --range_selects=off --point_selects=1 run

sys租户连接mo:
restore account sys from snapshot sp01;

表schema:

image

restore结果:

image

测试步骤2:

sys租户连接mo:
create database sysbench_db;

本地:git clone https://github.com/matrixorigin/mo-sysbench.git

cd mo-sysbench
sysbench  --mysql-host=127.0.0.1 --mysql-port=6001 --mysql-user=dump --mysql-password=111 oltp_point_select.lua --mysql-db=sysbench_db --tables=10 --table_size=100000 --threads=100 --time=30 --report-interval=10    cleanup

建db并插入数据
sysbench  --mysql-host=127.0.0.1 --mysql-port=6001 --mysql-user=dump --mysql-password=111 oltp_point_select.lua --mysql-db=sysbench_db --tables=10 --table_size=100000 --threads=100 --time=30 --report-interval=10    prepare

sys租户连接mo:
create snapshot sp01 for account sys;

什么测试也不执行直接执行下面的restore

sys租户连接mo:
restore account sys from snapshot sp01;

表schema:

image image

情景二: 单机执行tpcc 10-1000测试:

建表:https://github.com/matrixorigin/mo-tpcc/blob/main/sql/tableCreates.sql
mysql -h 127.0.0.1 -P 6001 -udump -p111
create snapshot sp03 for account sys;

执行tpcc 10-1000测试
restore account sys from snapshot sp03;
image

表schema:

image
YANGGMM commented 5 months ago

单独开事务执行是没有问题的

mysql> begin;
Query OK, 0 rows affected (0.00 sec)

mysql> drop table if exists `bmsql_history`;
Query OK, 0 rows affected (0.01 sec)

mysql> use tpcc;
Database changed
mysql> 
mysql> create table `bmsql_history` (`hist_id` INT NOT NULL AUTO_INCREMENT,`h_c_id` INT DEFAULT NULL,`h_c_d_id` INT DEFAULT NULL,`h_c_w_id` INT DEFAULT NULL,`h_d_id` INT DEFAULT NULL,`h_w_id` INT DEFAULT NULL,`h_date` TIMESTAMP DEFAULT NULL,`h_amount` DECIMAL(6,2) DEFAULT NULL,`h_data` VARCHAR(24) DEFAULT NULL,PRIMARY KEY (`hist_id`));
Query OK, 0 rows affected (0.02 sec)

mysql> insert into `tpcc`.`bmsql_history` SELECT * FROM `tpcc`.`bmsql_history` {snapshot = 'sp_01'};
Query OK, 300000 rows affected (1.93 sec)

mysql> commit;
Query OK, 0 rows affected (0.48 sec)
YANGGMM commented 5 months ago
image
jensenojs commented 5 months ago
企业微信截图_5145987b-4371-4b75-a315-01ec85f513b4

mo-service-backend.log mo-service-frontend.log

分别打印了frontend/backend执行的snapshot进行对比, 发现frontend比backend多了点东西, 需要向相关同事请教一下是什么, 打印日志的基本格式如下所示

// func restoreToDatabaseOrTable
    if tblName != "" {
        if b, _  := bh.(*backExec); b != nil {
            fmt.Println("---")
            fmt.Println("drop table")   
            fmt.Println(b.backSes.txnHandler.txnOp.GetOverview().Meta.SnapshotTS)
            fmt.Println("---")
        }
        if err = bh.Exec(toCtx, "drop table if exists "+dbName+"."+tblName); err != nil {
            return
        }
    }

单独开事务执行是没有问题的

mysql> begin;
Query OK, 0 rows affected (0.00 sec)

mysql> drop table if exists `bmsql_history`;
Query OK, 0 rows affected (0.01 sec)

mysql> use tpcc;
Database changed
mysql> 
mysql> create table `bmsql_history` (`hist_id` INT NOT NULL AUTO_INCREMENT,`h_c_id` INT DEFAULT NULL,`h_c_d_id` INT DEFAULT NULL,`h_c_w_id` INT DEFAULT NULL,`h_d_id` INT DEFAULT NULL,`h_w_id` INT DEFAULT NULL,`h_date` TIMESTAMP DEFAULT NULL,`h_amount` DECIMAL(6,2) DEFAULT NULL,`h_data` VARCHAR(24) DEFAULT NULL,PRIMARY KEY (`hist_id`));
Query OK, 0 rows affected (0.02 sec)

mysql> insert into `tpcc`.`bmsql_history` SELECT * FROM `tpcc`.`bmsql_history` {snapshot = 'sp_01'};
Query OK, 300000 rows affected (1.93 sec)

mysql> commit;
Query OK, 0 rows affected (0.48 sec)
jensenojs commented 5 months ago
企业微信截图_5145987b-4371-4b75-a315-01ec85f513b4

mo-service-backend.log mo-service-frontend.log

分别打印了frontend/backend执行的snapshot进行对比, 发现frontend比backend多了点东西, 需要向相关同事请教一下是什么, 打印日志的基本格式如下所示

// func restoreToDatabaseOrTable
  if tblName != "" {
      if b, _  := bh.(*backExec); b != nil {
          fmt.Println("---")
          fmt.Println("drop table")   
          fmt.Println(b.backSes.txnHandler.txnOp.GetOverview().Meta.SnapshotTS)
          fmt.Println("---")
      }
      if err = bh.Exec(toCtx, "drop table if exists "+dbName+"."+tblName); err != nil {
          return
      }
  }

单独开事务执行是没有问题的

mysql> begin;
Query OK, 0 rows affected (0.00 sec)

mysql> drop table if exists `bmsql_history`;
Query OK, 0 rows affected (0.01 sec)

mysql> use tpcc;
Database changed
mysql> 
mysql> create table `bmsql_history` (`hist_id` INT NOT NULL AUTO_INCREMENT,`h_c_id` INT DEFAULT NULL,`h_c_d_id` INT DEFAULT NULL,`h_c_w_id` INT DEFAULT NULL,`h_d_id` INT DEFAULT NULL,`h_w_id` INT DEFAULT NULL,`h_date` TIMESTAMP DEFAULT NULL,`h_amount` DECIMAL(6,2) DEFAULT NULL,`h_data` VARCHAR(24) DEFAULT NULL,PRIMARY KEY (`hist_id`));
Query OK, 0 rows affected (0.02 sec)

mysql> insert into `tpcc`.`bmsql_history` SELECT * FROM `tpcc`.`bmsql_history` {snapshot = 'sp_01'};
Query OK, 300000 rows affected (1.93 sec)

mysql> commit;
Query OK, 0 rows affected (0.48 sec)

从日志看, 不管是前端执行还是后端执行, snapshot都不止一个. 都是有问题的

YANGGMM commented 4 months ago
mysql> begin;
Query OK, 0 rows affected (0.01 sec)

mysql> 
mysql> drop table if exists `bmsql_history`;
Query OK, 0 rows affected (0.03 sec)

mysql> use tpcc;
Database changed
mysql> 
mysql> show full tables from `tpcc` {snapshot = 'sp_01'};
+------------------+------------+
| Tables_in_tpcc   | Table_type |
+------------------+------------+
| bmsql_config     | BASE TABLE |
| bmsql_customer   | BASE TABLE |
| bmsql_district   | BASE TABLE |
| bmsql_history    | BASE TABLE |
| bmsql_item       | BASE TABLE |
| bmsql_new_order  | BASE TABLE |
| bmsql_oorder     | BASE TABLE |
| bmsql_order_line | BASE TABLE |
| bmsql_stock      | BASE TABLE |
| bmsql_warehouse  | BASE TABLE |
+------------------+------------+
10 rows in set (0.02 sec)

mysql> show create table `tpcc`.`bmsql_history` {snapshot = 'sp_01'};
+---------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table         | Create Table                                                                                                                                                                                                                                                                                                                                         |
+---------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| bmsql_history | CREATE TABLE `bmsql_history` (
`hist_id` INT NOT NULL AUTO_INCREMENT,
`h_c_id` INT DEFAULT NULL,
`h_c_d_id` INT DEFAULT NULL,
`h_c_w_id` INT DEFAULT NULL,
`h_d_id` INT DEFAULT NULL,
`h_w_id` INT DEFAULT NULL,
`h_date` TIMESTAMP DEFAULT NULL,
`h_amount` DECIMAL(6,2) DEFAULT NULL,
`h_data` VARCHAR(24) DEFAULT NULL,
PRIMARY KEY (`hist_id`)
) |
+---------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.01 sec)

mysql> CREATE TABLE `bmsql_history` (
    -> `hist_id` INT NOT NULL AUTO_INCREMENT,
    -> `h_c_id` INT DEFAULT NULL,
    -> `h_c_d_id` INT DEFAULT NULL,
    -> `h_c_w_id` INT DEFAULT NULL,
    -> `h_d_id` INT DEFAULT NULL,
    -> `h_w_id` INT DEFAULT NULL,
    -> `h_date` TIMESTAMP DEFAULT NULL,
    -> `h_amount` DECIMAL(6,2) DEFAULT NULL,
    -> `h_data` VARCHAR(24) DEFAULT NULL,
    -> PRIMARY KEY (`hist_id`)
    -> );
Query OK, 0 rows affected (0.03 sec)

mysql> insert into `tpcc`.`bmsql_history` SELECT * FROM `tpcc`.`bmsql_history` {snapshot = 'sp_01'};
Query OK, 300000 rows affected (8.63 sec)

mysql> commit;
Query OK, 0 rows affected (0.48 sec)
YANGGMM commented 4 months ago

在执行restore中insert之前先提交事务,restore就可以顺利通过,感觉是backGroundExec的事务问题

image image
YANGGMM commented 4 months ago

在statement 执行入口添加log

-----------------------
executeStmtWithIncrStmt
sql : restore account sys database tpcc table `bmsql_history` from snapshot sp_01
snapshotTs : 1715844568861921000
-----------------------

-----------------------
executeStmtWithIncrStmt
sql : drop table if exists tpcc.bmsql_history
snapshotTs : 1715844568865944000
-----------------------

-----------------------
executeStmtWithIncrStmt
sql : show full tables from `tpcc` {snapshot = 'sp_01'}
snapshotTs : 1715844568884104000
-----------------------

-----------------------
executeStmtWithIncrStmt
sql : show create table `tpcc`.`bmsql_history` {snapshot = 'sp_01'}
snapshotTs : 1715844568887886000
-----------------------

-----------------------
executeStmtWithIncrStmt
sql : CREATE TABLE `bmsql_history` (
`hist_id` INT NOT NULL AUTO_INCREMENT,
`h_c_id` INT DEFAULT NULL,
`h_c_d_id` INT DEFAULT NULL,
`h_c_w_id` INT DEFAULT NULL,
`h_d_id` INT DEFAULT NULL,
`h_w_id` INT DEFAULT NULL,
`h_date` TIMESTAMP DEFAULT NULL,
`h_amount` DECIMAL(6,2) DEFAULT NULL,
`h_data` VARCHAR(24) DEFAULT NULL,
PRIMARY KEY (`hist_id`)
)
snapshotTs : 1715844568887886000
-----------------------

-----------------------
executeStmtWithIncrStmt
sql : insert into `tpcc`.`bmsql_history` SELECT * FROM `tpcc`.`bmsql_history` {snapshot = 'sp_01'}
snapshotTs : 1715844568903943000
-----------------------
YANGGMM commented 4 months ago

frontend 执行的log

-----------------------
executeStmtWithIncrStmt
sql : drop table if exists tpcc.bmsql_history
snapshotTs : 1715845293989984000
-----------------------

-----------------------
executeStmtWithIncrStmt
sql : show full tables from `tpcc` {snapshot = 'sp_01'}
snapshotTs : 1715845319434256000
-----------------------

-----------------------
executeStmtWithIncrStmt
sql : show create table `tpcc`.`bmsql_history` {snapshot = 'sp_01'}
snapshotTs : 1715845329498667000
-----------------------
-----------------------
executeStmtWithIncrStmt
sql : CREATE TABLE `bmsql_history` ( `hist_id` INT NOT NULL AUTO_INCREMENT, `h_c_id` INT DEFAULT NULL, `h_c_d_id` INT DEFAULT NULL, `h_c_w_id` INT DEFAULT NULL, `h_d_id` INT DEFAULT NULL, `h_w_id` INT DEFAULT NULL, `h_date` TIMESTAMP DEFAULT NULL, `h_amount` DECIMAL(6,2) DEFAULT NULL, `h_data` VARCHAR(24) DEFAULT NULL, PRIMARY KEY (`hist_id`) )
snapshotTs : 1715845333868618000
-----------------------

-----------------------
executeStmtWithIncrStmt
sql : insert into `tpcc`.`bmsql_history` SELECT * FROM `tpcc`.`bmsql_history` {snapshot = 'sp_01'}
snapshotTs : 1715845337754797000
-----------------------
YANGGMM commented 4 months ago

compile阶段的ts 跟 前端的ts 以及 snapshot的ts都对得上

-----------------------
executeStmtWithIncrStmt
sql : insert into `tpcc`.`bmsql_history` SELECT * FROM `tpcc`.`bmsql_history` {snapshot = 'sp_01'}
snapshotTs : 1715847008536497000
-----------------------

-----------------------
compileTableScanDataSource
schemaName : tpcc
relationName : bmsql_history
snapshotTs : 1715847008536497000
-----------------------

-----------------------
compileTableScanDataSource
schemaName : tpcc
relationName : bmsql_history
is snapshot op: true
snapshotTs : 1715846766094095000
-----------------------
image
triump2020 commented 4 months ago

The cause had been located, wait for PR

triump2020 commented 4 months ago
  1. 基本上可以判定自增列有关系, 事务 的 insert into select xxx statement 执行 与 update mo_incremental_columns 并行执行, 每个statement 在 compile.Run前会更新txn.snapshotWriteOffset, 导致了reader 出现了万圣节问题.
  2. 这个问题 跟snaphsot read 无关,只是 snapshot read 更容易暴露问题.
qingxinhome commented 4 months ago

Got it, deal with it as soon as possible

qingxinhome commented 4 months ago

Positioning

qingxinhome commented 4 months ago

问题原因详情已经 FixRestoreDuplicateEntry#16100.zip

确认,请参考文档

qingxinhome commented 4 months ago

pr已提交,待合并

qingxinhome commented 4 months ago

The issue has been resolved, please test and verify. @aressu1985 @TangxingZhou @sukki37 @volgariver6 @triump2020 @daviszhen @ouyuanning @jensenojs @YANGGMM

Ariznawlll commented 4 months ago

commit: 09c4d95dcdc615e1ea924239ff72af8c7863e286

情景1:

sys租户连接mo:
create database sysbench_db;

本地:git clone https://github.com/matrixorigin/mo-sysbench.git

cd mo-sysbench
sysbench  --mysql-host=127.0.0.1 --mysql-port=6001 --mysql-user=dump --mysql-password=111 oltp_point_select.lua --mysql-db=sysbench_db --tables=10 --table_size=100000 --threads=100 --time=30 --report-interval=10    cleanup

建db并插入数据
sysbench  --mysql-host=127.0.0.1 --mysql-port=6001 --mysql-user=dump --mysql-password=111 oltp_point_select.lua --mysql-db=sysbench_db --tables=10 --table_size=100000 --threads=100 --time=30 --report-interval=10    prepare

sys租户连接mo:
create snapshot sp01 for account sys;

执行点查测试
cd mo-sysbench
sysbench  --mysql-host=127.0.0.1 --mysql-port=6001 --mysql-user=dump --mysql-password=111   oltp_point_select.lua --mysql-db=sysbench_db --tables=10 --table_size=100000 --threads=100 --time=300 --report-interval=10    --range_selects=off --point_selects=1 run

sys租户连接mo:
restore account sys from snapshot sp01;
image

情景2:

sys租户连接mo:
create database sysbench_db;

本地:git clone https://github.com/matrixorigin/mo-sysbench.git

cd mo-sysbench
sysbench  --mysql-host=127.0.0.1 --mysql-port=6001 --mysql-user=dump --mysql-password=111 oltp_point_select.lua --mysql-db=sysbench_db --tables=10 --table_size=100000 --threads=100 --time=30 --report-interval=10    cleanup

建db并插入数据
sysbench  --mysql-host=127.0.0.1 --mysql-port=6001 --mysql-user=dump --mysql-password=111 oltp_point_select.lua --mysql-db=sysbench_db --tables=10 --table_size=100000 --threads=100 --time=30 --report-interval=10    prepare

sys租户连接mo:
create snapshot sp01 for account sys;

什么测试也不执行直接执行下面的restore

sys租户连接mo:
restore account sys from snapshot sp01;
image

情景3: create database tpcc; use tpcc; 建表:https://github.com/matrixorigin/mo-tpcc/blob/main/sql/tableCreates.sql

load 数据:

创建快照:create snapshot sp01 for account sys;

本地:git clone https://github.com/matrixorigin/mo-tpcc.git 执行tpcc 10-100测试:./runBenchmark props.mo

连接mo: mysql -h 127.0.0.1 -P 6001 -udump -p111

restore account sys database tpcc from snapshot sp01;

image

测试通过.