Open heni02 opened 5 months ago
Not sure how long the pprof data will be retained, so taking a screenshot first.
not working on it
Haven't looked at it yet.
Still haven't looked at it.
Updated links to the latest profiles: 40000000_100_columns_load_data.flate:
40000000_100_columns_load_data.csv:
Latest profile update: 10000000_200_columns_load_data.csv.gz
10000000_200_columns_load_data.csv
Haven't looked at it.
Haven't looked at it.
Haven't looked at it.
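On a local Mac with a little over 10 million rows, the compressed load takes roughly 3x as long as the uncompressed load.
On the Shenzhen machine the gap is also slightly over 3x with 10 million rows. Both tests loaded from server-side files, though, so could the larger gap in the regression run come from loading via S3?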
No time invested yet; this still needs further investigation.
No time invested.
No time invested.
No time invested.
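The main cause is probably that compressed files are read serially. I tried earlier to see whether a compressed file can be split into chunks, and so far it looks like compressed files are hard to split.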
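To illustrate the point about splitting, here is a minimal Python sketch, not MatrixOne's actual load path. It assumes newline-delimited CSV with no embedded newlines inside quoted fields, and every function name in it is hypothetical. An uncompressed file can be cut into byte ranges aligned to line boundaries and each range read independently, whereas a single gzip stream generally has to be decoded in order from its first byte.

```python
import gzip
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def split_offsets(path, parts):
    """Return (start, end) byte ranges that begin and end on line boundaries."""
    size = os.path.getsize(path)
    cuts = [0]
    with open(path, "rb") as f:
        for i in range(1, parts):
            target = size * i // parts
            if target <= cuts[-1]:
                continue
            f.seek(target)
            f.readline()  # skip to the start of the next full line
            pos = f.tell()
            if cuts[-1] < pos < size:
                cuts.append(pos)
    cuts.append(size)
    return list(zip(cuts[:-1], cuts[1:]))


def count_rows_range(path, start, end):
    """Count rows in one byte range of an uncompressed file."""
    with open(path, "rb") as f:
        f.seek(start)
        return f.read(end - start).count(b"\n")


def count_rows_parallel(path, parts=4):
    """Uncompressed input: each range is read independently, so work fans out."""
    ranges = split_offsets(path, parts)
    with ThreadPoolExecutor(max_workers=parts) as pool:
        return sum(pool.map(lambda r: count_rows_range(path, *r), ranges))


def count_rows_gzip(path):
    """gzip input: one stream, decoded sequentially from the beginning."""
    with gzip.open(path, "rb") as f:
        return sum(1 for _ in f)


def make_demo_files(directory, rows=1000):
    """Write the same demo rows as plain CSV and as gzip (illustrative data only)."""
    csv_path = os.path.join(directory, "demo.csv")
    gz_path = csv_path + ".gz"
    data = b"".join(b"%d,value_%d\n" % (i, i) for i in range(rows))
    with open(csv_path, "wb") as f:
        f.write(data)
    with gzip.open(gz_path, "wb") as f:
        f.write(data)
    return csv_path, gz_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        csv_path, gz_path = make_demo_files(d)
        print(count_rows_parallel(csv_path), count_rows_gzip(gz_path))
```

Both functions see the same rows, but only the uncompressed path has a natural place to cut the work. Seeking to an arbitrary offset inside the gzip file would land in the middle of compressed data that depends on everything before it.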
She is on leave.
No progress.
No progress.
No progress.
No progress.
No progress.
Is there an existing issue for the same bug?
Branch Name
main
Commit ID
8a5f0bd44
Other Environment Information
Actual Behavior
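Loading flate- or gzip-compressed files takes 6-7x longer than loading the same data uncompressed. Job: https://github.com/matrixorigin/mo-nightly-regression/actions/runs/9256599329/job/25472957901
With the same schema and data, the uncompressed load of 40 million rows takes only 4.4 min, and the uncompressed load of 10 million rows takes only 2.6 min.
Profile while loading the flate file: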
https://grafana.ci.matrixorigin.cn/explore?panes=%7B%22GWK%22:%7B%22datasource%22:%22pyroscope%22,%22queries%22:%5B%7B%22groupBy%22:%5B%5D,%22labelSelector%22:%22%7Bnamespace%3D%5C%22mo-nightly-regression-20240527%5C%22%7D%22,%22queryType%22:%22both%22,%22refId%22:%22A%22,%22datasource%22:%7B%22type%22:%22grafana-pyroscope-datasource%22,%22uid%22:%22pyroscope%22%7D,%22profileTypeId%22:%22process_cpu:cpu:nanoseconds:cpu:nanoseconds%22%7D%5D,%22range%22:%7B%22from%22:%221716843420000%22,%22to%22:%221716845460000%22%7D%7D%7D&schemaVersion=1&orgId=1
Profile while loading the gzip file: https://grafana.ci.matrixorigin.cn/explore?panes=%7B%22GWK%22:%7B%22datasource%22:%22pyroscope%22,%22queries%22:%5B%7B%22groupBy%22:%5B%5D,%22labelSelector%22:%22%7Bnamespace%3D%5C%22mo-nightly-regression-20240527%5C%22%7D%22,%22queryType%22:%22both%22,%22refId%22:%22A%22,%22datasource%22:%7B%22type%22:%22grafana-pyroscope-datasource%22,%22uid%22:%22pyroscope%22%7D,%22profileTypeId%22:%22process_cpu:cpu:nanoseconds:cpu:nanoseconds%22%7D%5D,%22range%22:%7B%22from%22:%221716845460000%22,%22to%22:%221716846480000%22%7D%7D%7D&schemaVersion=1&orgId=1
Expected Behavior
No response
Steps to Reproduce
Additional information
No response