Description
During query planning, a SortNode is converted into a partial sort node followed by a gather exchange with ensureSourceOrdering set to true, so that a single worker and thread performs the final merge and the output is sorted as specified.
However, for an internal feature we are developing, it is enough for the data to be sorted within each partition rather than globally. To achieve this, we need to plan the query so that the sort node operates on partitioned data and the single-threaded gather exchange is no longer needed.
This PR adds a new field, partitionedBy, to the sort node to specify the scope of the sort. An empty list means a global sort, which is the current behavior.
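To illustrate the intended semantics, here is a minimal sketch and not the actual implementation. The class SortNodeSketch, its methods, and the step descriptions are hypothetical stand-ins invented for this example. Only the partitionedBy field name, the empty-list-means-global rule, and the gather exchange with ensureSourceOrdering come from the description above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified stand-in for the planner's sort node (not the real class).
final class SortNodeSketch {
    final List<String> orderBy;
    // Scope of the sort; an empty list means a global sort (the current behavior).
    final List<String> partitionedBy;

    SortNodeSketch(List<String> orderBy, List<String> partitionedBy) {
        this.orderBy = Collections.unmodifiableList(new ArrayList<>(orderBy));
        this.partitionedBy = Collections.unmodifiableList(new ArrayList<>(partitionedBy));
    }

    boolean isGlobalSort() {
        return partitionedBy.isEmpty();
    }

    // Returns an illustrative description of the plan shape for this node.
    // The step names are assumptions made for this sketch, not planner output.
    List<String> planShape() {
        List<String> steps = new ArrayList<>();
        if (isGlobalSort()) {
            // Existing path: partial sort, then an ordered gather merged by one thread.
            steps.add("PartialSort" + orderBy);
            steps.add("GatherExchange(ensureSourceOrdering=true)");
        } else {
            // New path: sort each partition independently, with no gather exchange.
            steps.add("Sort" + orderBy + " within partitions " + partitionedBy);
        }
        return steps;
    }
}
```

With an empty partitionedBy the sketch yields the two-step global plan; with a non-empty list it yields a single per-partition sort step and no gather exchange.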
Motivation and Context
Described above
Impact
Enable sort within partitions
Test Plan
Add unit tests
Since the partitionedBy attribute is always empty after parsing and is only set in the optimizer, this change does not affect current production behavior.