matrixorigin / matrixone

Hyperconverged cloud-edge native database
https://docs.matrixorigin.cn/en
Apache License 2.0
1.79k stars 277 forks source link

[Bug]: Query from statement_info is very slow #18502

Open DanielZhangQD opened 2 months ago

DanielZhangQD commented 2 months ago

Is there an existing issue for the same bug?

Branch Name

1.2-dev

Commit ID

v1.2.2-d9b695823-2024-09-02

Other Environment Information

- Hardware parameters:
- OS type:
- Others:

Actual Behavior

mysql> explain analyze SELECT * FROM (select `statement`,system.statement_info.statement_id,IF(`status`='Running', TIMESTAMPDIFF(                                                                                                        m.statement_info.response_at,`user`,`database`,`transaction_id`,`session_id`,`rows_read`,`bytes_scan`,`result_count`,IF(status = 'Runni                                                                                                    '$[8]')),                                                                                                           ` >= '2024-08-03 06:05:44' AND `request_at` <= '2024-09-02 06:05:44' AND system.statement_info.account = '0191b0b4-3110-7459-9c57-5cb9ad5241be' AND sql_source_type IN ('cloud_user_sql','external_sql') )t ORDER BY request_at DESC LIMIT 20;select last_query_id();
Connection id:    1305347773
Current database: *** NONE ***

+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN                                                                                                                                                                                                                                                                                  |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Project                                                                                                                                                                                                                                                                                     |
|   Analyze: timeConsumed=0ms waitTime=92527ms inputRows=16 outputRows=16 InputSize=6497bytes OutputSize=5822bytes MemorySize=0bytes                                                                                                                                                          |
|   ->  Sort                                                                                                                                                                                                                                                                                  |
|         Analyze: timeConsumed=25ms waitTime=185051ms inputRows=0 outputRows=16 InputSize=0bytes OutputSize=6497bytes MemorySize=0bytes                                                                                                                                                      |
|         Sort Key: statement_info.request_at DESC                                                                                                                                                                                                                                            |
|         Limit: 20                                                                                                                                                                                                                                                                           |
|         ->  Table Scan on system.statement_info                                                                                                                                                                                                                                             |
|               Analyze: timeConsumed=369796ms scan_time=[91067ms,91089ms,91092ms,91175ms] filter_time=[1312ms,1327ms,1351ms,1374ms] waitTime=0ms inputRows=172197608 outputRows=16 InputSize=168gb OutputSize=17mb MemorySize=565mb                                                          |
|               Filter Cond: true, (statement_info.request_at >= 2024-08-03 06:05:44.000000), (statement_info.account = '0191b0b4-3110-7459-9c57-5cb9ad5241be'), (statement_info.request_at <= 2024-09-02 06:05:44.000000), statement_info.sql_source_type in ([cloud_user_sql external_sql]) |
|               Block Filter Cond: (statement_info.request_at >= 2024-08-03 06:05:44.000000), (statement_info.request_at <= 2024-09-02 06:05:44.000000), (statement_info.account = '0191b0b4-3110-7459-9c57-5cb9ad5241be')                                                                    |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
10 rows in set (2 min 9.53 sec)

+--------------------------------------+
| last_query_id()                      |
+--------------------------------------+
| 0191b72d-e6c6-7736-9ddb-f1ac6b7a0d8e |
+--------------------------------------+
1 row in set (0.03 sec)

Expected Behavior

Query returns in reasonable time

Steps to Reproduce

See detail in Actual Behavior

Additional information

Cloud qa env

badboynt1 commented 2 months ago

fixed

DanielZhangQD commented 2 months ago

To be verified

DanielZhangQD commented 2 months ago

Failed in cloud dev:

mysql> explain analyze SELECT * FROM (select `statement`,system.statement_info.statement_id,IF(`status`='Running', TIMESTAMPDIFF(MICROSECOND,`request_at`,now())*1000, `duration`) AS `duration`,`status`,`query_type`,`request_at`,system.statement_info.response_at,`user`,`database`,`transaction_id`,`session_id`,`rows_read`,`bytes_scan`,`result_count`,IF(status = 'Running', NULL, CAST(IF(JSON_UNQUOTE(JSON_EXTRACT(stats, '$[0]')) >= 4, JSON_UNQUOTE(JSON_EXTRACT(stats, '$[8]')), mo_cu_v1(stats, duration)) AS DECIMAL(32,4))) AS `cu` from system.statement_info where 1=1 AND `request_at` >= '2024-08-02 04:34:40' AND `request_at` <= '2024-09-02 06:34:40' AND system.statement_info.account = '0191461f-1289-719e-83cc-633fb2c0293c' AND sql_source_type IN ('cloud_user_sql','external_sql') )t ORDER BY request_at DESC LIMIT 20;

ERROR 20101 (HY000): internal error: panic inconsistent refcnt count: -1:
github.com/matrixorigin/matrixone/pkg/fileservice/memorycache.(*refcnt).release
    /go/src/github.com/matrixorigin/matrixone/pkg/fileservice/memorycache/refcnt.go:45
github.com/matrixorigin/matrixone/pkg/fileservice/memorycache.(*Data).Release
    /go/src/github.com/matrixorigin/matrixone/pkg/fileservice/memorycache/data.go:84
github.com/matrixorigin/matrixone/pkg/fileservice/memorycache.NewCache.func3
    /go/src/github.com/matrixorigin/matrixone/pkg/fileservi
DanielZhangQD commented 2 months ago

Depend on https://github.com/matrixorigin/matrixone/issues/18568

DanielZhangQD commented 2 months ago

To be verified

DanielZhangQD commented 2 months ago

To be verified

DanielZhangQD commented 2 months ago

Failed in cloud dev:

mysql> explain analyze SELECT * FROM (select `statement`,system.statement_info.statement_id,IF(`status`='Running', TIMESTAMPDIFF(MICROSECOND,`request_at`,now())*1000, `duration`) AS `duration`,`status`,`query_type`,`request_at`,system.statement_info.response_at,`user`,`database`,`transaction_id`,`session_id`,`rows_read`,`bytes_scan`,`result_count`,IF(status = 'Running', NULL, CAST(IF(JSON_UNQUOTE(JSON_EXTRACT(stats, '$[0]')) >= 4, JSON_UNQUOTE(JSON_EXTRACT(stats, '$[8]')), mo_cu_v1(stats, duration)) AS DECIMAL(32,4))) AS `cu` from system.statement_info where 1=1 AND `request_at` >= '2024-08-02 04:34:40' AND `request_at` <= '2024-09-02 06:34:40' AND system.statement_info.account = '0191461f-1289-719e-83cc-633fb2c0293c' AND sql_source_type IN ('cloud_user_sql','external_sql') )t ORDER BY request_at DESC LIMIT 20;

ERROR 20101 (HY000): internal error: panic inconsistent refcnt count: -1:
github.com/matrixorigin/matrixone/pkg/fileservice/memorycache.(*refcnt).release
  /go/src/github.com/matrixorigin/matrixone/pkg/fileservice/memorycache/refcnt.go:45
github.com/matrixorigin/matrixone/pkg/fileservice/memorycache.(*Data).Release
  /go/src/github.com/matrixorigin/matrixone/pkg/fileservice/memorycache/data.go:84
github.com/matrixorigin/matrixone/pkg/fileservice/memorycache.NewCache.func3
  /go/src/github.com/matrixorigin/matrixone/pkg/fileservi

The query hangs in the latest commit.

badboynt1 commented 2 months ago

数据分布不均匀 待查的目标账户查了五千万行才有五行 导致limit 20的优化不生效

badboynt1 commented 2 months ago

需要修改ddl为clusterby account,requestat

badboynt1 commented 1 month ago

等待修改ddl以后重测

badboynt1 commented 1 month ago

等待修改ddl以后重测

badboynt1 commented 4 weeks ago

等待在2.0环境上验证

aressu1985 commented 3 weeks ago

testing