matrixorigin / matrixone

Hyperconverged cloud-edge native database
https://docs.matrixorigin.cn/en
Apache License 2.0

[Bug]: SHOW SUBSCRIPTIONS takes too long to execute (50s) #17157

Closed xiaoshuwei closed 1 month ago

xiaoshuwei commented 3 months ago

Is there an existing issue for the same bug?

Branch Name

1.2-dev

Commit ID

994f3da6e

Other Environment Information

- Hardware parameters:
- OS type:
- Others:
Aliyun dev environment

Actual Behavior

The log is as follows:

2024/06/26 06:59:13.623551 +0000 ERROR models/meta.go:205 trace {"error": "context canceled", "elapsed": "50.275409214s", "rows": -1, "sql": "/* cloud_nonuser */ SHOW SUBSCRIPTIONS;"}

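This occurred at 2024/06/26 14:59:13.623551 (UTC+8) on instance ID 01903344-4d60-7fb3-8a5d-02dafaf20ebc. SHOW SUBSCRIPTIONS ran for 50s. The problem is intermittent, but long SHOW SUBSCRIPTIONS execution times have been showing up frequently of late. For example: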

2024/06/21 10:22:03.673961 +0000 WARN models/meta.go:205 trace {"elapsed": "18.645909513s", "rows": 2, "sql": "/* cloud_nonuser */ SHOW SUBSCRIPTIONS;"}
2024/06/23 03:16:05.190764 +0000 WARN models/meta.go:205 trace {"elapsed": "7.941041296s", "rows": 0, "sql": "/* cloud_nonuser */ SHOW SUBSCRIPTIONS;"}
2024/06/24 02:57:11.670517 +0000 WARN models/meta.go:205 trace {"elapsed": "8.919531897s", "rows": 2, "sql": "/* cloud_nonuser */ SHOW SUBSCRIPTIONS;"}
2024/06/24 03:12:58.969537 +0000 WARN models/meta.go:205 trace {"elapsed": "6.459748717s", "rows": 2, "sql": "/* cloud_nonuser */ SHOW SUBSCRIPTIONS;"}
2024/06/24 07:05:34.041415 +0000 WARN models/meta.go:205 trace {"elapsed": "6.595885059s", "rows": 2, "sql": "/* cloud_nonuser */ SHOW SUBSCRIPTIONS;"}
2024/06/24 07:14:37.261123 +0000 WARN models/meta.go:205 trace {"elapsed": "8.534372255s", "rows": 2, "sql": "/* cloud_nonuser */ SHOW SUBSCRIPTIONS;"}

This list is not exhaustive, but the execution time is clearly unstable.

Expected Behavior

Normal execution time. A normal execution log looks like this:

2024/06/24 03:13:41.657093 +0000 DEBUG models/meta.go:205 trace {"elapsed": "96.906607ms", "rows": 2, "sql": "/* cloud_nonuser */ SHOW SUBSCRIPTIONS;"}

That is, under 1s.

Steps to Reproduce

Not reliably reproducible.

Additional information

No response

xiaoshuwei commented 3 months ago

Note: show subscriptions all; has the same problem.

2024/06/26 03:00:09.341368 +0000 WARN load/sub_pub.go:182 trace {"elapsed": "10.852148147s", "rows": 5, "sql": "/* cloud_nonuser */ show subscriptions all"}

daviszhen commented 3 months ago

Monitoring needs to be added.

DanielZhangQD commented 3 months ago

In Cloud dev:

2024/07/04 02:16:28.715246 +0000 WARN models/meta.go:214 trace {"elapsed": "15.817965909s", "rows": 0, "sql": "/* cloud_nonuser */ SHOW SUBSCRIPTIONS;"}
2024/07/04 02:16:48.627979 +0000 WARN load/sub_pub.go:182 trace {"elapsed": "17.568277655s", "rows": 5, "sql": "/* cloud_nonuser */ show subscriptions all"}
2024/07/04 02:16:48.643250 +0000 WARN models/meta.go:214 trace {"elapsed": "17.521654748s", "rows": 0, "sql": "/* cloud_nonuser */ SHOW SUBSCRIPTIONS;"}

ck89119 commented 1 month ago

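After the publish-subscribe enhancement phase 2 refactoring, show subscriptions no longer needs to switch through every account (tenant) to collect information, so in theory this problem should no longer occur.

Handing this over to testing for continued observation.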

Ariznawlll commented 1 month ago

testing

Ariznawlll commented 1 month ago

commit: dadbdfda0

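Test steps:

  1. Create accounts acc001-acc500 with a script; these accounts each create a db to publish to acc1000: createAccount.txt

  2. In each of acc001-acc500, create a db, insert data, and publish it to acc1000: pub.txt

  3. In account acc1000, run show subscriptions all to list every publication it has permission to subscribe to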

Elapsed time: 0.01s

Test passed.
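
For anyone re-running this check without the attachments, the setup can be approximated by generating the SQL for the 500 publisher accounts. This is only a sketch: the actual contents of createAccount.txt and pub.txt are not shown in this thread, so the admin name, password, database, table, and publication names below are hypothetical placeholders, and the statement forms follow the documented MatrixOne `create account` and `create publication` syntax as I understand it.

```go
package main

import (
	"fmt"
	"strings"
)

// accountName formats the i-th publisher account name, e.g. acc001.
func accountName(i int) string {
	return fmt.Sprintf("acc%03d", i)
}

// createAccountSQL builds the statement that creates one account.
// It is meant to be run from the sys account; the admin name and
// password are placeholders, not values taken from the issue.
func createAccountSQL(name string) string {
	return fmt.Sprintf("create account %s admin_name 'admin' identified by '111';", name)
}

// publishSQL builds the statements one publisher account would run after
// logging in: create a db, insert data, and publish the db to subscriber.
// Database, table, and publication names are hypothetical.
func publishSQL(name, subscriber string) []string {
	db := "db_" + name
	return []string{
		fmt.Sprintf("create database %s;", db),
		fmt.Sprintf("create table %s.t1 (a int);", db),
		fmt.Sprintf("insert into %s.t1 values (1);", db),
		fmt.Sprintf("create publication pub_%s database %s account %s;", name, db, subscriber),
	}
}

func main() {
	const subscriber = "acc1000"
	fmt.Println(createAccountSQL(subscriber))
	for i := 1; i <= 500; i++ {
		name := accountName(i)
		fmt.Println(createAccountSQL(name))
		// The following statements must be executed in a session of `name`.
		fmt.Println(strings.Join(publishSQL(name, subscriber), "\n"))
	}
	// Finally, in a session of acc1000, time: show subscriptions all;
}
```

The generated output still has to be split per account, since the publish statements run in each publisher's own session.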