matrixorigin / matrixone

Hyperconverged cloud-edge native database
https://docs.matrixorigin.cn/en
Apache License 2.0

improve mem reuse for multi_update #20185

Closed ouyuanning closed 2 days ago

ouyuanning commented 2 days ago

What type of PR is this?

Which issue(s) this PR fixes:

issue #19820

What this PR does / why we need it:

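A few small optimizations:

1. When incoming batches all have 8192 rows, copy the batch directly instead of going through the complex fillData logic.
2. In SortByKey, reuse the sels held by proc instead of calling make for a new slice every time.
3. In SortByKey, skip the call to Shuffle when sels is already in sorted order.
4. Remove an unnecessary loop that computed rowCount.

For large inputs that are already ordered, this gives a modest performance improvement and reduces memory usage. For example: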

create table t1 (a int primary key, b int, c int);
INSERT INTO t1 SELECT result,result,result FROM generate_series(1,30000000) g;

For this SQL, in a single-node environment, performance improves by about 5%, and memory usage drops from 1.3G to 1G.
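
For readers who want the idea behind the sels reuse and the skip-Shuffle check in SortByKey, here is a minimal standalone sketch in Go. It is not the code from this PR: `selsPool`, `get`, and `sortSels` are hypothetical names made up for illustration, and the real implementation works on batches and vectors rather than a plain `[]int64`. The sketch only shows the two patterns: reuse one backing array for the selection slice, and report that no reorder is needed when the sorted selection turns out to be the identity permutation.

```go
package main

import (
	"fmt"
	"sort"
)

// selsPool is a hypothetical stand-in for a reusable buffer kept on the process.
type selsPool struct {
	buf []int64
}

// get returns a slice of length n, allocating only when the
// existing backing array is too small.
func (p *selsPool) get(n int) []int64 {
	if cap(p.buf) < n {
		p.buf = make([]int64, n)
	}
	return p.buf[:n]
}

// sortSels computes the row order that sorts keys ascending.
// needShuffle is false when the rows are already in order, in which
// case the caller can skip the reorder step entirely.
func sortSels(keys []int64, pool *selsPool) (sels []int64, needShuffle bool) {
	sels = pool.get(len(keys))
	for i := range sels {
		sels[i] = int64(i)
	}
	sort.SliceStable(sels, func(a, b int) bool {
		return keys[sels[a]] < keys[sels[b]]
	})
	// If sels is still 0, 1, 2, ... the input was already sorted.
	for i, s := range sels {
		if s != int64(i) {
			return sels, true
		}
	}
	return sels, false
}

func main() {
	pool := &selsPool{}

	_, need := sortSels([]int64{1, 2, 3, 4}, pool)
	fmt.Println(need) // false: already sorted, no reorder needed

	sels, need := sortSels([]int64{3, 1, 2}, pool)
	fmt.Println(sels, need) // [1 2 0] true
}
```

In the second call the pool hands back the array allocated by the first call, so no new allocation happens. For an insert such as the generate_series one above, where the keys arrive in order, every batch would take the `needShuffle == false` path.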