Is your feature request related to a problem or challenge?
PartialSortExec was added in https://github.com/apache/arrow-datafusion/issues/7456 / https://github.com/apache/arrow-datafusion/pull/9125.
While one of the major benefits of this operator is that it reduces the memory required when sorting data (because it can emit output early), we should also handle the case where it still cannot fit everything in memory.
Describe the solution you'd like
Add spilling support to PartialSortExec so that, if it runs out of memory, it spills to disk rather than returning an error.
Describe alternatives you've considered
No response
Additional context
https://github.com/apache/arrow-datafusion/issues/9153 tracks enabling PartialSort for more queries
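To make the requested behavior concrete, here is a minimal, standard-library-only sketch of "spill to disk instead of erroring". It is not DataFusion code and uses none of its APIs: `SpillingSorter`, its `budget` (counted in rows rather than bytes), and the spill file naming are all hypothetical, and it sorts plain `i64` values rather than Arrow record batches. It also does not model the partially sorted input that PartialSortExec takes advantage of; it only illustrates the general idea of writing sorted runs to disk when a memory budget is hit and merging them at the end.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;
use std::fs::{self, File};
use std::io::{self, BufRead, BufReader, BufWriter, Write};
use std::path::PathBuf;

/// Hypothetical toy sorter: holds at most `budget` values in memory and
/// writes a sorted run to disk when the budget is reached, instead of failing.
struct SpillingSorter {
    budget: usize,
    buffer: Vec<i64>,
    spills: Vec<PathBuf>,
    dir: PathBuf,
}

impl SpillingSorter {
    fn new(budget: usize, dir: PathBuf) -> Self {
        Self { budget, buffer: Vec::new(), spills: Vec::new(), dir }
    }

    /// Buffers one value, spilling the current buffer first if it is full.
    fn insert(&mut self, value: i64) -> io::Result<()> {
        if self.buffer.len() >= self.budget {
            self.spill()?;
        }
        self.buffer.push(value);
        Ok(())
    }

    /// Sorts the in-memory buffer and writes it out as one run, one value per line.
    fn spill(&mut self) -> io::Result<()> {
        self.buffer.sort_unstable();
        // Assumed naming scheme, chosen only to avoid collisions in this sketch.
        let name = format!("spill-sketch-{}-{}.txt", std::process::id(), self.spills.len());
        let path = self.dir.join(name);
        let mut out = BufWriter::new(File::create(&path)?);
        for v in self.buffer.drain(..) {
            writeln!(out, "{}", v)?;
        }
        out.flush()?;
        self.spills.push(path);
        Ok(())
    }

    /// Merges all spilled runs with the remaining buffer, emitting values in
    /// ascending order. Only one value per run is held in the heap at a time.
    fn finish(mut self, mut emit: impl FnMut(i64)) -> io::Result<()> {
        self.buffer.sort_unstable();
        let mut runs: Vec<Box<dyn Iterator<Item = i64>>> = Vec::new();
        for path in &self.spills {
            let reader = BufReader::new(File::open(path)?);
            // A real implementation would propagate these errors; the sketch panics.
            runs.push(Box::new(reader.lines().map(|line| {
                line.expect("spill file is readable")
                    .parse::<i64>()
                    .expect("spill file holds integers")
            })));
        }
        runs.push(Box::new(std::mem::take(&mut self.buffer).into_iter()));

        // Min-heap of (next value, index of the run it came from).
        let mut heap = BinaryHeap::new();
        for (i, run) in runs.iter_mut().enumerate() {
            if let Some(v) = run.next() {
                heap.push(Reverse((v, i)));
            }
        }
        while let Some(Reverse((v, i))) = heap.pop() {
            emit(v);
            if let Some(next) = runs[i].next() {
                heap.push(Reverse((next, i)));
            }
        }
        for path in &self.spills {
            let _ = fs::remove_file(path); // best-effort cleanup
        }
        Ok(())
    }
}
```

A caller would create the sorter with a budget and a scratch directory (for example `std::env::temp_dir()`), call `insert` for each value, and then call `finish` with a closure that receives the sorted values. A real operator would track bytes against a shared memory limit rather than counting rows.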