Open jinchengchenghh opened 1 week ago
Could you reference the related Velox PRs for this feature in the PR description? That will help users track it. Thanks.
I'm also thinking about setting up a more comprehensive integration benchmark for spill performance.
The following results are from the existing GHA OOM tests:
Before:
After:
We have a spill performance test in our internal Jenkins, and I have triggered it. It will run once the machine is ready, since the server is down today.
Let's post the Jenkins performance results here.
Prefix sort can reduce the spill sort time by 3x on the sampled Meta production query, whereas timsort increases the sort time by 20%. Timsort performance therefore seems to depend on the actual data pattern.

After picking https://github.com/facebookincubator/velox/pull/11527, a Jenkins spill query involving strings takes 41s with prefixsort vs. 37s with timsort vs. 63s with stdsort.

Relevant Velox PR: https://github.com/facebookincubator/velox/pull/11384

Resolves https://github.com/apache/incubator-gluten/issues/7900
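For readers unfamiliar with the prefix sort idea mentioned above, here is a minimal Python sketch of the general technique: each key is normalized into a fixed-width, byte-comparable prefix, so most comparisons touch only a few bytes, and the full key is consulted only when prefixes tie. This is an illustration under assumptions, not Velox's implementation; the names `prefix_key` and `prefix_sort` and the 8-byte width are hypothetical.

```python
def prefix_key(value: str, width: int = 8) -> bytes:
    """Fixed-width, byte-comparable prefix of a string key (illustrative only)."""
    raw = value.encode("utf-8")
    # Truncate to `width` bytes and zero-pad so every prefix has the same size.
    return raw[:width].ljust(width, b"\x00")


def prefix_sort(values: list[str], width: int = 8) -> list[str]:
    """Sort strings by comparing fixed-width prefixes first.

    The full UTF-8 bytes are used only to break ties between equal prefixes.
    """
    return sorted(values, key=lambda v: (prefix_key(v, width), v.encode("utf-8")))
```

Because UTF-8 byte order matches code point order, the result agrees with a plain `sorted()` on the same strings. The longer the shared leading bytes among keys, the more often the tie-break path is taken, which is one reason the benefit of such a scheme can vary with the data.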