matrixorigin / matrixone

Hyperconverged cloud-edge native database
https://docs.matrixorigin.cn/en
Apache License 2.0

[Bug]: restore oom. #16138

Open Ariznawlll opened 2 months ago

Ariznawlll commented 2 months ago

Is there an existing issue for the same bug?

Branch Name

main

Commit ID

934856ce3e7006df740da76c608a532970b1d8a5

Other Environment Information

- Hardware parameters:
- OS type:
- Others:

Actual Behavior

[two images omitted]

oom:

[image omitted]

log:https://grafana.ci.matrixorigin.cn/explore?panes=%7B%228Sy%22:%7B%22datasource%22:%22loki%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22expr%22:%22%7Bnamespace%3D%5C%22mo-big-data-20240514%5C%22%7D%20%7C%3D%20%60%60%22,%22queryType%22:%22range%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22loki%22%7D,%22editorMode%22:%22builder%22%7D%5D,%22range%22:%7B%22from%22:%22now-3h%22,%22to%22:%22now%22%7D%7D%7D&schemaVersion=1&orgId=1

profile: 2024-05-15_13_01_27.zip 2024-05-15_13_02_48.zip

Expected Behavior

No response

Steps to Reproduce

Connect to MO as the sys account:
create account acc01 admin_name = 'test_account' identified by '111';
create account acc02 admin_name = 'test_account' identified by '111';

Connect to MO as acc01:
create database ssb_100g;
use ssb_100g;
create table ssb_100g.date ( d_datekey date, d_date char (18), d_dayofweek char (9), d_month char (9), d_year int, d_yearmonthnum int, d_yearmonth char (7), d_daynuminweek varchar(12), d_daynuminmonth int, d_daynuminyear int, d_monthnuminyear int, d_weeknuminyear int, d_sellingseason varchar (12), d_lastdayinweekfl varchar (1), d_lastdayinmonthfl varchar (1), d_holidayfl varchar (1), d_weekdayfl varchar (1));
create table ssb_100g.customer ( c_custkey int, c_name varchar (25), c_address varchar (25), c_city char (10), c_nation char (15), c_region char (12), c_phone char (15), c_mktsegment char (10) ) cluster by (c_region, c_nation, c_city);
create table ssb_100g.lineorder ( lo_orderkey bigint, lo_linenumber int, lo_custkey int, lo_partkey int, lo_suppkey int, lo_orderdate date, lo_orderpriority char (15), lo_shippriority tinyint, lo_quantity double, lo_extendedprice double, lo_ordtotalprice double, lo_discount double, lo_revenue double, lo_supplycost double, lo_tax double, lo_commitdate date, lo_shipmode char (10) ) cluster by lo_orderdate;
create table ssb_100g.part ( p_partkey int, p_name varchar (22), p_mfgr char (6), p_category char (7), p_brand char (9), p_color varchar (11), p_type varchar (25), p_size int, p_container char (10) ) cluster by (p_mfgr, p_category, p_brand);
create table ssb_100g.supplier ( s_suppkey int, s_name char (25), s_address varchar (25), s_city char (10), s_nation char (15), s_region char (12), s_phone char (15) ) cluster by (s_region, s_nation, s_city);

Load SQL: please contact me.

Connect to MO as the sys account again:
create snapshot sp01 for account acc01;
restore account acc01 database ssb_100g from snapshot sp01 to account acc02;
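
For repeated runs, the two sys-account statements above could be generated from parameters. The following is a minimal sketch, not part of MatrixOne: the function name `snapshot_restore_statements` and the identifier check are assumptions made for illustration, and the statement text simply mirrors the syntax used in these reproduction steps.

```python
import re

# Conservative identifier check (an assumption for this sketch, not a
# MatrixOne rule): letters, digits and underscores only.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def snapshot_restore_statements(src_account, database, snapshot, dst_account):
    """Build the snapshot and cross-account restore statements from the steps."""
    for name in (src_account, database, snapshot, dst_account):
        if _IDENT.fullmatch(name) is None:
            raise ValueError(f"unexpected identifier: {name!r}")
    return [
        f"create snapshot {snapshot} for account {src_account};",
        f"restore account {src_account} database {database} "
        f"from snapshot {snapshot} to account {dst_account};",
    ]


if __name__ == "__main__":
    # Prints the two statements used in this report.
    for stmt in snapshot_restore_statements("acc01", "ssb_100g", "sp01", "acc02"):
        print(stmt)
```

The statements still have to be run from a sys-account session; the sketch only builds the text.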

Additional information

No response

YANGGMM commented 1 month ago

Still looking into it.

YANGGMM commented 1 month ago

No environment is available to reproduce this.

aressu1985 commented 1 month ago

Discussed with Song; moving this to 1.3.0 for resolution.

YANGGMM commented 1 month ago

This is related to https://github.com/matrixorigin/matrixone/issues/16650

Ariznawlll commented 1 month ago

[0605] Manual test in the TKE environment:

create database big_data_test;
use big_data_test;
CREATE TABLE `table_basic_for_load_100m` (
  `col1` TINYINT DEFAULT NULL,
  `col2` SMALLINT DEFAULT NULL,
  `col3` INT DEFAULT NULL,
  `col4` BIGINT DEFAULT NULL,
  `col5` TINYINT UNSIGNED DEFAULT NULL,
  `col6` SMALLINT UNSIGNED DEFAULT NULL,
  `col7` INT UNSIGNED DEFAULT NULL,
  `col8` BIGINT UNSIGNED DEFAULT NULL,
  `col9` FLOAT DEFAULT NULL,
  `col10` DOUBLE DEFAULT NULL,
  `col11` VARCHAR(255) DEFAULT NULL,
  `col12` DATE DEFAULT NULL,
  `col13` DATETIME DEFAULT NULL,
  `col14` TIMESTAMP NULL DEFAULT NULL,
  `col15` BOOL DEFAULT NULL,
  `col16` DECIMAL(16,6) DEFAULT NULL,
  `col17` TEXT DEFAULT NULL,
  `col18` JSON DEFAULT NULL,
  `col19` BLOB DEFAULT NULL,
  `col20` BINARY(255) DEFAULT NULL,
  `col21` VARBINARY(255) DEFAULT NULL,
  `col22` VECF32(3) DEFAULT NULL,
  `col23` VECF32(3) DEFAULT NULL,
  `col24` VECF64(3) DEFAULT NULL,
  `col25` VECF64(3) DEFAULT NULL
);
load data url s3option {'endpoint'='http://cos.ap-guangzhou.myqcloud.com','access_key_id'='****','secret_access_key'='****','bucket'='****', 'filepath'='mo-big-data/100000000_20_columns_load_data_new.csv'} into table big_data_test.table_basic_for_load_100m fields terminated by '|' lines terminated by '\n' ignore 1 lines parallel 'true';
create snapshot sp01 for account sys;
drop table table_basic_for_load_100m;
restore account sys database big_data_test table table_basic_for_load_100m from snapshot sp01;   -- the error occurs here
[image omitted]

log:https://grafana.ci.matrixorigin.cn/explore?panes=%7B%22WAS%22:%7B%22datasource%22:%22loki%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22expr%22:%22%7Bnamespace%3D%5C%22mo-big-data-20240604%5C%22%7D%20%7C%3D%20%60%60%22,%22queryType%22:%22range%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22loki%22%7D,%22editorMode%22:%22builder%22%7D%5D,%22range%22:%7B%22from%22:%22now-6h%22,%22to%22:%22now%22%7D%7D%7D&schemaVersion=1&orgId=1

YANGGMM commented 1 month ago

The OOM is caused by the insert-select in this log entry: {"level":"INFO","time":"2024/06/05 05:19:37.898063 +0000","name":"cn-service.frontend","caller":"frontend/snapshot.go:750","msg":"[sp01] start to insert select table: table_basic_for_load_100m, insert sql: insert into big_data_test.table_basic_for_load_100m SELECT * FROM big_data_test.table_basic_for_load_100m {snapshot = 'sp01'}","uuid":"34383862-3730-3965-3963-363134663161"}

Logs: https://grafana.ci.matrixorigin.cn/explore?panes=%7B%22WAS%22:%7B%22datasource%22:%22loki%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22expr%22:%22%7Bnamespace%3D%5C%22mo-big-data-20240604%5C%22%7D%20%7C%3D%20%60%5Bsp01%5D%60%22,%22queryType%22:%22range%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22loki%22%7D,%22editorMode%22:%22builder%22%7D%5D,%22range%22:%7B%22from%22:%22now-6h%22,%22to%22:%22now%22%7D%7D%7D&schemaVersion=1&orgId=1
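
When scanning the Loki output for other tables, it may help to pull the snapshot name, table, and generated statement out of entries like the one quoted in this comment. This is a rough sketch only: the helper name `extract_restore_insert` is hypothetical, and the message pattern is inferred from this single log line, so it may not match other MatrixOne versions.

```python
import json
import re

# The log entry quoted in this comment (frontend/snapshot.go:750).
LOG_LINE = (
    '{"level":"INFO","time":"2024/06/05 05:19:37.898063 +0000",'
    '"name":"cn-service.frontend","caller":"frontend/snapshot.go:750",'
    '"msg":"[sp01] start to insert select table: table_basic_for_load_100m, '
    'insert sql: insert into big_data_test.table_basic_for_load_100m '
    'SELECT * FROM big_data_test.table_basic_for_load_100m '
    "{snapshot = 'sp01'}\","
    '"uuid":"34383862-3730-3965-3963-363134663161"}'
)

# Message layout assumed from the one entry above.
_MSG = re.compile(
    r"\[(?P<snapshot>[^\]]+)\] start to insert select table: "
    r"(?P<table>[^,]+), insert sql: (?P<sql>.+)"
)


def extract_restore_insert(log_line):
    """Return snapshot, table and SQL from a restore insert-select log entry."""
    record = json.loads(log_line)
    match = _MSG.fullmatch(record["msg"])
    if match is None:
        raise ValueError("not a restore insert-select log entry")
    return match.groupdict()


if __name__ == "__main__":
    info = extract_restore_insert(LOG_LINE)
    print(info["snapshot"], info["table"])
    print(info["sql"])
```

The extracted statement shows that the restore copies the whole table with a single `insert into ... SELECT * FROM ... {snapshot = 'sp01'}`, which is the statement identified above as the source of the OOM.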

YANGGMM commented 1 month ago

Still looking into it.

YANGGMM commented 1 month ago

still working

YANGGMM commented 3 weeks ago

Still looking into it.

YANGGMM commented 3 weeks ago

Still looking into it.

YANGGMM commented 2 weeks ago

Still looking into it.

YANGGMM commented 2 weeks ago

Still looking into it.

YANGGMM commented 1 week ago

Still looking into it.

YANGGMM commented 6 days ago

Still looking into it.

YANGGMM commented 3 days ago

Still looking into it.

Ariznawlll commented 17 hours ago

[0715] main commit: cccaf5e

Job URL: https://github.com/matrixorigin/mo-nightly-regression/actions/runs/9936199822/job/27444584364

Data size: 37 GB

[two images omitted]

Memory usage: https://grafana.ci.matrixorigin.cn/explore?panes=%7B%22aTR%22:%7B%22datasource%22:%22pyroscope%22,%22queries%22:%5B%7B%22groupBy%22:%5B%5D,%22labelSelector%22:%22%7Bnamespace%3D%5C%22mo-snapshot-test-20240715%5C%22%7D%22,%22queryType%22:%22both%22,%22refId%22:%22A%22,%22profileTypeId%22:%22memory:inuse_space:bytes:space:bytes%22,%22datasource%22:%7B%22type%22:%22grafana-pyroscope-datasource%22,%22uid%22:%22pyroscope%22%7D%7D%5D,%22range%22:%7B%22from%22:%22now-3h%22,%22to%22:%22now%22%7D%7D%7D&schemaVersion=1&orgId=1