matrixorigin / matrixone

Hyperconverged cloud-edge native database
https://docs.matrixorigin.cn/en
Apache License 2.0

[Bug]: performance degradation of insertion #19061

Open jensenojs opened 1 day ago

jensenojs commented 1 day ago

Is there an existing issue for the same bug?

Branch Name

main

Commit ID

latest

Other Environment Information

- Hardware parameters:
- OS type:
- Others:

Actual Behavior

drop database if exists test; create database test; use test;
CREATE TABLE `metric` (
  `metric_name` varchar(1024) DEFAULT 'sys' COMMENT 'metric name, like: sql_statement_total, server_connections, process_cpu_percent, sys_memory_used, ...',
  `collecttime` datetime(6) NOT NULL COMMENT 'metric data collect time',
  `value` double DEFAULT '0.0' COMMENT 'metric value',
  `node` varchar(1024) DEFAULT 'monolithic' COMMENT 'mo node uuid',
  `role` varchar(1024) DEFAULT 'monolithic' COMMENT 'mo node role, like: CN, DN, LOG',
  `account` varchar(1024) DEFAULT 'sys' COMMENT 'account name',
  `type` varchar(1024) NOT NULL COMMENT 'sql type, like: insert, select, ...'
) COMMENT='metric data[mo_no_del_hint]' CLUSTER BY (`collecttime`, `metric_name`, `account`);

insert into metric select
"metric_name_" || (result % 22),
date_add('2024-08-01 00:00:00', interval cast(result / 1000 as int) SECOND),
  result,
  "node_" ||  (result % 10),
  "role_" || (result % 3),
  "account_" || (result % 100),
  "type_" || (result % 10)
from generate_series(1,1e7) g;
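For reference, the value expressions in the INSERT ... SELECT above can be modeled in Python (a hypothetical standalone sketch of the per-row logic, not MatrixOne code; `int(result / 1000)` truncates, which approximates `cast(result / 1000 as int)`):

```python
from datetime import datetime, timedelta

def make_row(result: int) -> dict:
    """Mirror the value expressions of the reproduction INSERT ... SELECT."""
    base = datetime(2024, 8, 1, 0, 0, 0)
    return {
        "metric_name": f"metric_name_{result % 22}",
        # one extra second per 1000 rows of generate_series output
        "collecttime": base + timedelta(seconds=int(result / 1000)),
        "value": float(result),
        "node": f"node_{result % 10}",
        "role": f"role_{result % 3}",
        "account": f"account_{result % 100}",
        "type": f"type_{result % 10}",
    }
```

Note the distribution this produces: 10M rows collapse onto roughly 10,000 distinct `collecttime` values, 22 metric names, and 100 accounts, so the `CLUSTER BY` key has low cardinality with many duplicates per key.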

Running the SQL above fails partway through:

[1]  + 17948 killed     ./mo-service -debug-http :9876 -launch ./etc/launch/launch.toml > 2>&1
ERROR 2013 (HY000): Lost connection to MySQL server during query
No connection. Trying to reconnect...
ERROR 2003 (HY000): Can't connect to MySQL server on '127.0.0.1:6001' (61)
ERROR: Can't connect to the server

mysql>
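The server process being killed mid-insert suggests (this is an assumption, not confirmed by the log) the OS OOM killer. A hypothetical helper for confirming it is to sample the process RSS once per second while the statement runs, and check `dmesg` for an OOM record after the kill:

```shell
#!/bin/sh
# Sample the resident set size (KB) of a process once per second
# until it exits. Hypothetical helper; assumes a Linux/BSD `ps`.
# Usage: watch_rss "$(pgrep -f mo-service | head -n1)"
watch_rss() {
  pid="$1"
  while kill -0 "$pid" 2>/dev/null; do
    printf '%s rss_kb=%s\n' "$(date +%T)" "$(ps -o rss= -p "$pid" | tr -d ' ')"
    sleep 1
  done
  echo "process $pid exited"
}
```

If the RSS climbs steadily until the process vanishes, `dmesg | grep -i 'killed process'` should show the OOM kill entry.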



@Ariznawlll Could QA please add the related observations?

Expected Behavior

No response

Steps to Reproduce

drop database if exists test; create database test; use test;
CREATE TABLE `metric` (
  `metric_name` varchar(1024) DEFAULT 'sys' COMMENT 'metric name, like: sql_statement_total, server_connections, process_cpu_percent, sys_memory_used, ...',
  `collecttime` datetime(6) NOT NULL COMMENT 'metric data collect time',
  `value` double DEFAULT '0.0' COMMENT 'metric value',
  `node` varchar(1024) DEFAULT 'monolithic' COMMENT 'mo node uuid',
  `role` varchar(1024) DEFAULT 'monolithic' COMMENT 'mo node role, like: CN, DN, LOG',
  `account` varchar(1024) DEFAULT 'sys' COMMENT 'account name',
  `type` varchar(1024) NOT NULL COMMENT 'sql type, like: insert, select, ...'
) COMMENT='metric data[mo_no_del_hint]' CLUSTER BY (`collecttime`, `metric_name`, `account`);

insert into metric select
"metric_name_" || (result % 22),
date_add('2024-08-01 00:00:00', interval cast(result / 1000 as int) SECOND),
  result,
  "node_" ||  (result % 10),
  "role_" || (result % 3),
  "account_" || (result % 100),
  "type_" || (result % 10)
from generate_series(1,1e7) g;

***

The same applies to the related big-data tests.

Additional information

No response

jensenojs commented 1 day ago

In the reproduction SQL I gave above, the observed execution count of the background SQL looks abnormal; this needs a closer look.

jensenojs commented 1 day ago

Commit range that needs bisecting:

b59fadc5f add data key table (#18893)
9e563d4fc add object sinker for gc, checkpoint, transfer etc. (#18898)
e0114d116 [bug] proxy: session does not transfer after COM_STMT_CLOSE. (#18614)
0e0c081f8 fix race (#18918)
759a66efb Revert "tune ap memory cache policy (#18852)" (#18915)
be8b64e57 choose the most selective index to get better performance (#18907)
52f440751 increase the priority of existing txn (#18619)
a087803b1 refactor: externalscan compile (#18827)
e0ec4ef3e fix connection string parsing (#18906)
1f17a1407 remove DispatchNotifyCh and vectorPool from process. (#18888)
66d31b2e7 fix bug: do not clean data in group.Reset (#18885)
3f3788c4e fix Cpu Time statistics value less then 0 (#18838)
1dfe2ecf5 optimize search in json (#18884)
fcacfc58f [enhancement] clusterservice: add regexp cache (#17967)
e1f1b09a8 ignore the blk info when commit data object to tn (step 1) (#18858)
6f0808748 add protobuf messages for MULTI_UPDATE (#18656)
e0eb7d6da [Improvement] : turn on dynamic folding of executor (#18887)
d852c27ac external improvement (#18857)
745116dc3 Refactor merge task host (#18883)
06cd080dd remove catalog operation log from checkpoint and replay TN from the three-table data (#18578)
949c826ba [bug] fix ut TestNodeGossip (#18882)
d22360d7c [#15201]Remove BVT tag (#18876)
514254474 tune ap memory cache policy (#18852)
76ea1a2b7 Add multi update operator (#18845)
8e3fffacb remove InefficientMustStrCol in merge&sort. (#18868)
0010ae8f0 add simple object reader for s3 transfer and gc optimization (#18867)
edca9ef16 (origin/tpcc, tpcc) [fix] : add DebugTool for executor, add test cases, fix some bugs (#18828)
50f45ea9f fix TestChangesHandle3 (#18861)
c87bcdaa1 mo-service: fix go build tag (#18851)
786f7bb50 add case for datalink (#18846)
ab7363aba Add policyCompact (#18720)
f666bc0ed fix error info (#18836)
6141ccae3 fix ndv calculation to make it more accurate (#18847)
afd7c5337 Fix ckp (#18825)
ff41a9b5c fix sql generation of show cdc task (#18830)
c678c5a69 delete time Index for tombstone objects in partition state. (#18832)
4be8e6312 make internal sql executor support partition table (#18841)
717cb94ed add MergeSortBatches function for future code reuse (#18840)
11e40a926 fix stats for scan with limit (#18824)
4e1a2f88a Adding a jsonbyte fulltext tokenizer. (#18740)
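Assuming the list above is newest-first, the range can be narrowed with `git bisect` (in the real tree: `git bisect start b59fadc5f 4e1a2f88a`, then rebuild and rerun the reproduction INSERT at each step, marking `bad`/`good`). Below is a self-contained demo of that workflow on a throwaway repo, with an automated check standing in for "rerun the insert and time it":

```shell
#!/bin/sh
# Demo of the bisect workflow on a scratch repo (hypothetical stand-in
# for the matrixone tree). Five commits; pretend the regression lands
# in commit 4, and let `git bisect run` binary-search for it.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name demo
for i in 1 2 3 4 5; do
  echo "$i" > f.txt
  git add f.txt
  git commit -qm "commit $i"
done
git bisect start HEAD HEAD~4          # newest is bad, oldest is good
# Exit 0 = good build, non-zero = bad build (stand-in for the timing check):
git bisect run sh -c 'test "$(cat f.txt)" -lt 4'
git log -1 --format='first bad: %s' refs/bisect/bad
```

With ~36 commits in the real range, this converges in about six rebuild-and-test steps.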
Ariznawlll commented 1 day ago

insert into ... select performance degradation:

The problematic commit lies between 06cd080dd6c3384d1a19f4de6a0083e152f44b96 and 9abebe71d0d5059d21466083a1077fdcfe4d397e.


Test scenario 1:

create table  if not exists big_data_test.table_with_pk_for_load_1B( id bigint primary key, col1 tinyint, col2 smallint, col3 int, col4 bigint, col5 tinyint unsigned, col6 smallint unsigned, col7 int unsigned, col8 bigint unsigned, col9 float, col10 double, col11 varchar(255), col12 Date, col13 DateTime, col14 timestamp, col15 bool, col16 decimal(16,6), col17 text, col18 json, col19 blob, col20 binary(255), col21 varbinary(255), col22 vecf32(3), col23 vecf32(3), col24 vecf64(3), col25 vecf64(3) );

load data url s3option {'endpoint'='http://cos.ap-guangzhou.myqcloud.com/','access_key_id'='***','secret_access_key'='***','bucket'='mo-load-guangzhou-1308875761', 'filepath'='mo-big-data/1000000000_20_columns_load_data_pk.csv'} into table big_data_test.table_with_pk_for_load_1B fields terminated by '|' lines terminated by '\n' ignore 1 lines parallel 'true';

insert into big_data_test.table_with_pk_index_for_insert_1B(id,col1,col2,col3,col4,col5,col6,col7,col8,col9,col10,col11,col12,col13,col14,col15,col16,col17,col18,col19,col20,col21,col22,col23,col24,col25) select * from big_data_test.table_with_pk_for_load_1B where col4 != -7508478199581380391;

Test scenario 2:

create table  if not exists big_data_test.table_with_pk_for_load_1B( id bigint primary key, col1 tinyint, col2 smallint, col3 int, col4 bigint, col5 tinyint unsigned, col6 smallint unsigned, col7 int unsigned, col8 bigint unsigned, col9 float, col10 double, col11 varchar(255), col12 Date, col13 DateTime, col14 timestamp, col15 bool, col16 decimal(16,6), col17 text, col18 json, col19 blob, col20 binary(255), col21 varbinary(255), col22 vecf32(3), col23 vecf32(3), col24 vecf64(3), col25 vecf64(3) );

load data url s3option {'endpoint'='http://cos.ap-guangzhou.myqcloud.com/','access_key_id'='***','secret_access_key'='***','bucket'='mo-load-guangzhou-1308875761', 'filepath'='mo-big-data/1000000000_20_columns_load_data_pk.csv'} into table big_data_test.table_with_pk_for_load_1B fields terminated by '|' lines terminated by '\n' ignore 1 lines parallel 'true';

insert into big_data_test.table_with_com_pk_index_for_insert_1B(id,col1,col2,col3,col4,col5,col6,col7,col8,col9,col10,col11,col12,col13,col14,col15,col16,col17,col18,col19,col20,col21,col22,col23,col24,col25)  select * from big_data_test.table_with_pk_for_load_1B where col4 != -7508478199581380391;

The access key and secret key are redacted for security reasons; contact me directly for them.