chapel-lang / chapel

a Productive Parallel Programming Language
https://chapel-lang.org
Other
1.8k stars 421 forks source link

Are there conditions under which `with (ref myArray)` should not be required? #23488

Closed bradcray closed 4 months ago

bradcray commented 1 year ago

As I work on updating codes to be compatible with Chapel 1.32, I keep finding myself taken aback that a loop like:

forall i in 1..n do
  A[i] += 1;

now requires a with (ref A) to avoid warnings. This issue asks whether there are cases in which we could/should squash the warning in order to reduce syntactic overhead and potential annoyance without reducing safety. Some conditions in which I could imagine doing so are:

And then I think we need to combine that first style of rule with something like:

mppf commented 1 year ago

IMO the specific loop in the OP is better written as forall elt in A do elt += 1. Are you seeing other examples where a rewrite lake that wouldn't make it nicer?

bradcray commented 1 year ago

The motivation for this issue is that I think many users find it more natural to loop over an index set and use array accesses rather than iterating over the array itself, based on languages they're coming from. Moreover, if the same index is used for multiple arrays, I think it can be nicer to write:

forall i in Domain do
  A[i] = B[i] + C[i];

than:

forall i in Domain with (ref A) do
  A[i] = B[i] + C[i];

or:

forall (i, a) in zip(Domain, A) do
  a = B[i] + C[i];

or:

forall (a, b, c) in zip(A, B, C) do
  a = b + c;

If we were able to handle the case where the written array index expression was an affine transformation of 'i', it also feels more straightforward to write that using indexing rather than as an array iteration over an array slice (say)—and potentially cheaper.

bradcray commented 1 year ago

(I should also add that I realize there's a slippery slope here and am not saying "obviously we should do this." But I do think it's an obvious question to ask and wonder whether we'll get similar reactions from users).

jabraham17 commented 1 year ago

This feels related to https://github.com/chapel-lang/chapel/issues/23229, which asks similar questions about whether a variable like ref mySlice also then needs a with (ref mySlice).

One of my concerns would be in the body of a large loop, having the with (ref ...) at the top makes it very clear what is modified and what is not.

bmcdonald3 commented 1 year ago

When resolving the Arkouda deprecation warnings for this release, there were a huge number of cases that had to have ref added to them (see https://github.com/Bears-R-Us/arkouda/commit/2d77681659eecf7c64d4b225081089229be0a993). This made me think about this change, thinking that if so much code had been written without people thinking about it, it could be concerning that now they would have to think about it. In other words, if every file in Arkouda had to have a modification made to accommodate the requiring of arrays to be passed as ref, that means that at least this subset of developers will have to shift their paradigms of developing.

After seeing the changes in Arkouda and realizing those developing Arkouda will have to keep this change in mind, I wrote up a couple of small examples:

// this is a warning
proc noRefIntent() {
  var a: [0..10] int;
  forall i in 0..10 do
    a[i] += 1;
}

// this is fine
proc refIntent() {
  var a: [0..10] int;
  forall i in 0..10 with (ref a) do
    a[i] += 1;
}

// this is a warning
proc noRefIntent2() {
  var a: [0..10] int;
  var b: [0..10] int;
  [i in a.domain] b[i] = a[i];
}

// this is fine
proc refIntent2() {
  var a: [0..10] int;
  var b: [0..10] int;
  [i in a.domain with (ref b)] b[i] = a[i];
}

Thinking about these in the context of teaching Chapel to someone, I think it is unfortunate that a new user who is writing a very simple loop will have to take into account the with intent for arrays. Of course, these functions could be written differently to loop over the array specifically, etc., but I think the point here is that a beginning user won't have that knowledge and could have a slight hiccup with needing to add the ref.

In Arkouda itself, the changes were a bit more complex, where an example is this loop:

forall (i, column, ot, si, ui, ri, bi, segidx) in zip(0..#ncols, sym_names, col_objTypes, str_idx, int_idx, real_idx, bool_idx, segment_idx)

Turning into this loop:

forall (i, column, ot, si, ui, ri, bi, segidx) in zip(0..#ncols, sym_names, col_objTypes, str_idx, int_idx, real_idx, bool_idx, segment_idx) with (ref pt\
rList, ref segmentPtr, ref datatypes, ref sizeList, ref segarray_sizes, ref c_names, ref segment_tracking, ref str_vals, ref int_vals, ref real_vals, ref bool_\
vals, ref objTypes)

This is another case where the code could be refactored or something to make it less noticeable, but I think the point here is that this is actual code that someone wrote, so refactoring the code seems a little bit like asking the developer to change their approach to programming, rather than making it easier for a user.

jabraham17 commented 1 year ago

I want to capture some thoughts here, which could possibly go under their own issue but are highly relevant here.

On large complex foralls, like in Arkouda, the sheer number of arrays modified inside a loop turns the loop header into a mess. If we had a way to write a blanket intent, this would greatly simplify this.

We have a note in the spec (https://chapel-lang.org/docs/language/spec/task-parallelism-and-synchronization.html#task-intents)

For a given intent, we would also like to provide a blanket clause, which would apply the intent to all variables. An example of syntax for a blanket ref intent would be ref *.

I am thinking of this similarly to how lambda captures work in C++. In C++, [=] says to pass all outer variables as copies and [&] says to pass all outer variables as modifiable references.

A blanket intent could turn the above Arkouda loop into:

forall (i, column, ot, si, ui, ri, bi, segidx) in zip(0..#ncols, sym_names, col_objTypes, str_idx, int_idx, real_idx, bool_idx, segment_idx) with (ref *)

I think it would be interesting to define what outer variables means here. Would it literally take ALL outer variables by ref (similar to nested functions) or maybe just the ones in the closest scope. Would we want to limit ref * to only work on arrays, probably not but just putting the idea out there.

This is not a complete solution and doesn't address other cases (like forall i in A.domain do A[i] = i;), but might improve the situation.

bradcray commented 1 year ago

Continuing to think about this while working on release notes, I found myself wondering "In the current behavior with --warn-unstable, is it possible to write a forall loop that iterates over a range or domain (without zippering in every other thing being modified) that does something meaningful as a function of i?" My guess is "no" since any results that want to persist outside of the loop would need to be stored within something as a function of i, requiring a non-local modification. But I'd be happy to be proven wrong.

jabraham17 commented 4 months ago

This was discussed and resolved in https://github.com/chapel-lang/chapel/issues/23819, we no longer require with (ref myArray) on modifiable arrays