chapel-lang / chapel

a Productive Parallel Programming Language
https://chapel-lang.org
Other
1.79k stars 421 forks source link

How should the `Sort` module handle distributed arrays? #10760

Open LouisJenkinsCS opened 6 years ago

LouisJenkinsCS commented 6 years ago

@mppf

What should the pattern be for sorting distributed arrays? One way that I can think of is to sort the portion of the array corresponding to each locale's local subdomains and then perform a 'merge' at the end, but would this work for all sorting algorithms? Can we make use of aggregation (#10386) to batch up fine-grained operations? I don't think that there should be a single distributed sorting algorithm such as radix sort, I believe that there should be variants that the user can choose from with different trade-offs (Space Complexity, Time-Complexity, No. of Communications, etc.)

mppf commented 6 years ago

One way that I can think of is to sort the portion of the array corresponding to each locale's local subdomains and then perform a 'merge' at the end, but would this work for all sorting algorithms?

This is one of the main options. The other main option is to distribute first (divide up the key space, send each element to the right locale) and then sort locally.

I'm confident we need sort to do something reasonable with distributed arrays. I think it'd be reasonable to allow people to call other variants from the default but one think to keep in mind is that these distributed-sort variants will be different from the usual "select a sorting algorithm" choice. I.e. it won't be "bubbleSort" vs "quickSort" but rather e.g. "sort first" vs "distribute first".

mppf commented 5 years ago

So, do we need sort(Dst, Src, comparator) ?

ben-albrecht commented 5 years ago

I don't think an overload that provides a destination argument is a bad idea, even for the non-distributed cases.