chapel-lang / chapel

a Productive Parallel Programming Language
https://chapel-lang.org
Other
1.78k stars 419 forks source link

can swap on arrays be supported? #12559

Open mppf opened 5 years ago

mppf commented 5 years ago

This issue is about to what extent the swap operator could be supported on arrays. (The swap operator is currently named <=> but there is some discussion about renaming it in #8545).

It might be appealing to implement <=> on arrays as swapping the pointers when the types are the same. There are several problems with this:

  1. We might like to assume that the pointer pointed to by the _array record is constant. See for example #10160. Additionally I believe remote value forwarding depends on this property. (And probably other optimizations do too).
  2. Swapping two arrays with different run-time-types might have strange behavior.

Optimizations depending on constant pointer

Arguably such optimization could check for the use of swap or arrays, and stop assuming that the pointer inside of the _array record is invariant.

Alternatively, the implementation of swap on arrays could swap internal elements of the pointed-to _instance. For example, in a DefaultRectangular array, it could swap the fields of DefaultRectangularArr, such as the pointer to the data elements. This would preserve the invariant _array field (but probably require the swap to move more data, but still not do O(array size) work). This second option seems relatively easy to implement and does not bring other costs.

Different Run-time Types

For example

proc test() {
  var A:[1..10] int;
  var B:[1..9] int;
  A <=> B;
  // Now A is over 1..10 and B is over 1..9 ?
}

Of course we could make it a runtime error to swap two arrays with different run-time types. However I think the above behavior is reasonable, given that these are runtime types. Perhaps one should think of it as equivalent to:

proc test() {
  var ADomain = {1..10};
  var A:[ADomain] int;
  var BDomain = {1..9};
  var B:[BDomain] int;
  {
     // First, swap the domains of the two arrays
     var tmpDomain = ADomain;
     var tmp = A;
     ADomain = BDomain;
     BDomain = tmpDomain;
     // now A and B have swapped domains but not elements

     // swap the elements
     var tmp = A;
     A = B[A.domain];
     B = tmp[B.domain];
  }
  // Now A is over 1..10 and B is over 1..9
}

At a high level, we might imagine swapping A and B also swaps which domain these arrays refer to - i.e. it also swaps their runtime types.

So, then the question is whether or not we allow the swap operator to change the domain of the arrays without mentioning the domain explicitly. We could, on A <=> B for arrays A and B:

I think that the last option has an interesting relationship with sorting arrays-of-arrays. Right now, the language/compiler has best support for arrays-of-arrays where all of the inner arrays have the same domain. However we also have some desire to support skyline arrays. Towards skyline arrays, we currently have some support for arrays-of-arrays where there is no declared inner domain. For example, see test/arrays/ferguson/from-iterator/skyline.chpl:

proc makeArray(i:int) {
  var A:[1..i] int;
  A[1] = i;
  return A;
}

// AA is an array of arrays
var AA = for i in r do makeArray(i);

It's important to note that we might decide not to support this pattern at all. Anyway, in this example, the inner arrays all have differing sizes.

Now what would happen if you ran sort on the array of arrays AA? Let's ignore for the moment the question default comparator and assume that the user provided a comparator for the inner array elements. Now, sort should generally be implemented with the swap operator <=>, since it need not actually destroy any elements. (And in fact, in this case, trying to have sort implemented with = to move elements around would lead to array size mismatch errors).

So, consider some part inside of sort. It's just figured out that AA[i] needs to go before AA[j] but they currently aren't in that order. So it uses AA[i] <=> AA[j].

In this case, there is no way to get the original domain for AA[i] or AA[j] to call swapArraysAndDomains. Does that mean:

a) There is a problem with this strategy for skyline arrays, or b) We need to be able to swap arrays without reference to their domain.

bradcray commented 3 years ago

I didn't recall this issue, but stumbled across it while looking to close one I'd filed about implementing <=> on arrays as pointer swaps after @mcdobe100's work this summer. I feel as though the <=> operator on arrays should not work for arrays of different sizes like the 1..9 and 1..10 examples above because I think of <=> as being a bi-directional assignment which ought to behave similarly to:

temp = B;
B = A;
A = temp;

and since we don't support assignments between arrays of different sizes, we shouldn't support it here either.

However, if this pattern were important to someone, I could see supporting it through some sort of method or standalone function so my negative reaction is only about using the <=> operator for this purpose.