Open heni02 opened 3 months ago
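Original error message
Error messages from around 2024-06-24 04:00: https://grafana.ci.matrixorigin.cn/goto/KtEhzSQIR?orgId=1
There are a large number of "failed to init stats info for table" and "connect: cannot assign requested address" errors. The latter seems to be caused by fd exhaustion that makes bind fail; possibly some network connections are not being closed, or too many are being opened at the same time.
Also, the "connect: cannot assign requested address" error only appears around 04:00: https://grafana.ci.matrixorigin.cn/goto/Agb4iSwIR?orgId=1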
Is there an existing issue for the same bug?
* [x] I have checked the existing issues.
Branch Name
main
Commit ID
Other Environment Information
- Hardware parameters:
- OS type:
- Others:
Actual Behavior
job:https://github.com/matrixorigin/mo-nightly-regression/actions/runs/9567035983/job/26400260798
Expected Behavior
No response
Steps to Reproduce
tke regression sysbench1000w nopk-load and nopk-point-select test
Additional information
No response
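Here is yet another problem: https://grafana.ci.matrixorigin.cn/goto/nyU3iIQIR?orgId=1
It looks like "cannot get table by ID" is the main cause, and it is also related to stats in disttae.
Both reproductions are related to stats in the disttae package, so reassigning to @volgariver6 for further analysis.
Running with the modified branch still shows the problem. The root cause has not been located yet; the only observation so far is that a large number of goroutines are created when the problem occurs.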
No progress yet.
Testing in progress.
Testing still reports errors. @reusee, please keep tracking this.
Confirmed, closed. commit: 88e486e11 https://github.com/matrixorigin/mo-nightly-regression/actions/runs/9976227425
0816: commit:081becbc9158278b3bd974075a40e3539aafa1db
job url:https://github.com/matrixorigin/mo-nightly-regression/actions/runs/10405469110/job/28842654264
@Ariznawlll please run a bisect on this; it was probably introduced by a recent PR.
Waiting for the bisect result.
Bisect points to PR 18100, cc @ouyuanning
job:https://github.com/matrixorigin/mo-auto-test/actions/runs/10508265514/job/29112655376
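@heni02 Are you sure it is PR 18100? That PR only calls reader.Close, and reader.Close does nothing except reset an iterator.
The error is the same as #17793, and a similar problem is now seen on 1.2-dev as well. It was probably not introduced by the 2 PRs found by bisect.
Updated error info for 1.2: 1.2-dev commit: 59d156b46 job: https://github.com/matrixorigin/mo-nightly-regression/actions/runs/10634487640/job/29503486397
Socket information for the process needs to be collected before the source of the leak can be confirmed.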
No progress.
No progress.
No progress.
Reproduced on 1.2-dev commit: b0f3a5481; the environment has been configured with the relevant parameters. job: https://github.com/matrixorigin/mo-nightly-regression/actions/runs/11142247340/job/30999607062
yaml: https://github.com/matrixorigin/mo-nightly-regression/blob/main/mo-bench-tke.yaml
Is there an existing issue for the same bug?
Branch Name
main
Commit ID
6a43a6e7a
Other Environment Information
Actual Behavior
job:https://github.com/matrixorigin/mo-nightly-regression/actions/runs/9567035983/job/26400260798
mo log: https://grafana.ci.matrixorigin.cn/explore?panes=%7B%22kbI%22:%7B%22datasource%22:%22loki%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22expr%22:%22%7Bnamespace%3D%5C%22mo-nightly-regression-20240618%5C%22%7D%20%7C%3D%20%60%60%22,%22queryType%22:%22range%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22loki%22%7D,%22editorMode%22:%22builder%22%7D%5D,%22range%22:%7B%22from%22:%221718774634000%22,%22to%22:%221718774664000%22%7D%7D%7D&schemaVersion=1&orgId=1
Expected Behavior
No response
Steps to Reproduce
Additional information
No response