Open LouisJenkinsCS opened 6 years ago
One way that I can think of is to sort the portion of the array corresponding to each locale's local subdomains and then perform a 'merge' at the end, but would this work for all sorting algorithms?
This is one of the main options. The other main option is to distribute first (divide up the key space, send each element to the right locale) and then sort locally.
I'm confident we need sort
to do something reasonable with distributed arrays. I think it'd be reasonable to allow people to call other variants from the default but one think to keep in mind is that these distributed-sort variants will be different from the usual "select a sorting algorithm" choice. I.e. it won't be "bubbleSort" vs "quickSort" but rather e.g. "sort first" vs "distribute first".
So, do we need sort(Dst, Src, comparator) ?
I don't think an overload that provides a destination argument is a bad idea, even for the non-distributed cases.
@mppf
What should the pattern be for sorting distributed arrays? One way that I can think of is to sort the portion of the array corresponding to each locale's local subdomains and then perform a 'merge' at the end, but would this work for all sorting algorithms? Can we make use of aggregation (#10386) to batch up fine-grained operations? I don't think that there should be a single distributed sorting algorithm such as radix sort, I believe that there should be variants that the user can choose from with different trade-offs (Space Complexity, Time-Complexity, No. of Communications, etc.)