chapel-lang / chapel

a Productive Parallel Programming Language
https://chapel-lang.org

Discussion on forall intents #23819

jabraham17 closed this issue 9 months ago

jabraham17 commented 11 months ago

Ad hoc discussion on forall intents

Subteam


  1. What is the API being presented?

    The default intent for all types (except sync/atomic) is const. We currently apply the same default-intent rule to all uses in the language (argument intents, this-intents, task intents), with the exception of forall intents. forall intents mostly follow the default intent, except for arrays, which get a default intent of ref if they are modified in the loop body. There is a warning for relying on this behavior, off by default, under --warn-unstable.

  2. What's the history of the feature, if it already exists?

    During discussion for 2.0, we decided to simplify our intents in https://github.com/chapel-lang/chapel/issues/21398 and https://github.com/Cray/chapel-private/issues/5093. This involved changing the default intent for arrays to const in all cases, including forall intents. After this change was made, prior to the release of 1.32, some developers and users (specifically Arkouda devs) balked at this change as maybe being a step too far. This caused us to change the warning for default intents to be unstable and off by default.

    Some of the loops that gave people pause:

    Example 1
    forall i in 1..10 with (ref A) do
     A[i] = i;
    Example 2
    forall i in A.domain with (ref A) do
     A[i] = i;
    Example 3
    forall i in 1..n with (ref A) do
     A[i] = B[i];
    Example 4
    [i in 1..n with (ref A)] A[i] = B[i];
    Example 5
    // from arkouda
    forall i in mySegInds
     with (var agg = new SrcAggregator(int),
           ref mySegs, ref myLens, ref myELens, ref myESegs) {
     agg.copy(mySegs[i], segments[i]);
     agg.copy(myLens[i], lengths[i]);
     agg.copy(myELens[i], eLengths[i]);
     agg.copy(myESegs[i], encodeOffsets[i]);
    }
    Example 6
    // from arkouda
    forall (i, column, ot, si, ui, ri, bi, segidx) in
     zip(0..#ncols, sym_names, col_objTypes,
         str_idx, int_idx, real_idx, bool_idx, segment_idx)
     with (ref ptrList, ref segmentPtr, ref datatypes, ref sizeList,
           ref segarray_sizes, ref c_names, ref segment_tracking,
           ref str_vals, ref int_vals, ref real_vals, ref bool_vals) do ...
  3. What's the precedent in other languages, if they support it? As relevant, we especially care about: a. Python, b. C/C++, c. Rust, d. Swift, e. Julia, f. Go, but this list may evolve over time.

    Most languages don't have the same concept of intents, particularly with respect to parallel language constructs like forall.

    One that comes to mind is C++ lambdas, whose capture semantics I sometimes find confusing.

    Note: Lambdas have the general syntax [](){}, where [] is the capture list, () is the argument list, and {} is the body (there are many more syntactic variations). The [] is what I will focus on. We can create and immediately invoke a do-nothing lambda as [](){}();, or store one in a variable as auto l = [](){};.

    For example, for a simple int, we get perhaps reasonable behavior that is consistent.

    int i = 10;
    []() {
      std::cout << i << "\n"; // error: i is not captured
    }();
    [i]() { // capture i by copy
      std::cout << i << "\n";
    }();
    [=]() { // capture everything by copy
      std::cout << i << "\n";
    }();
    [&i]() { // capture i by reference
      std::cout << i << "\n";
    }();
    [&]() { // capture everything by reference
      std::cout << i << "\n";
    }();
    [=, i]() { // error: explicit capture matches the default
      std::cout << i << "\n";
    }();

    Changing i to be const changes the default.

    const int i = 10;
    []() { // OK: a const int with a constant initializer is usable without an explicit capture
      std::cout << i << "\n";
    }();
    [i]() { // capture i by copy
      std::cout << i << "\n";
    }();
    [=]() { // capture everything by copy
      std::cout << i << "\n";
    }();
    [&i]() { // capture i by reference
      std::cout << i << "\n";
    }();
    [&]() { // capture everything by reference
      std::cout << i << "\n";
    }();
    [=, i]() { // error: explicit capture matches the default
      std::cout << i << "\n";
    }();

    There are a few other exceptions to the lambda capturing rules, for example non-local and static variables can be captured implicitly by default. See https://en.cppreference.com/w/cpp/language/lambda and https://en.cppreference.com/w/cpp/language/lambda#Lambda_capture.

    One particular usage of lambdas is with the parallel algorithms in the C++ standard library. The example below is a bit contrived, but I think it demonstrates the problem in other languages as well.

    #include <algorithm>
    #include <array>
    #include <execution>

    std::array<int, 5> A = {0, 1, 2, 3, 4};
    std::array<int, 5> B;
    std::for_each(std::execution::par, A.begin(), A.end(), [&](int a) {
      B[a] = a; // B is captured by reference implicitly, with no compiler check
    });
  4. Are there known Github issues with the feature? Issues posted by users are most relevant, but issues encountered by developers trying to use the feature are also valuable.

  5. Are there features it is related to? What impact would changing or adding this feature have on those other features? Other features can live in the same module, but they can also live in other modules (e.g., a find method on arrays should keep the find method on strings and bytes in mind; the interface for maps should keep in mind other collection types like lists, sets, heaps, and arrays, etc.)

    Whatever we do with forall is going to relate to task intents for other parallel structures like coforall, begin, and cobegin. This consistency argument could also be extended to argument and this intents, but those feel like less of a burden. I think we either need a really good way to rationalize the difference, or there should be no difference. Intents being orthogonal across the language makes them easier to learn and remember, even if the rule is complex.

  6. How do you propose to solve the problem? Try to have multiple ideas and a recommendation of what we should do.

    1. foralls maintain their ref if modified intent for arrays.

    2. foralls have a default intent for arrays of const, just like everything else.

      We could make this change on its own, but given that developers and users balked at the idea, there are a few mitigations we could consider.

      1. Implement compiler logic to determine "safe" indices. For example, an affine transform of the index i like 2*i+1 is guaranteed to be unique and safe if i is. So we could have iterators yield "unique" indices, and then do some compiler magic to determine the affine transforms.

        • This makes simple loops like Example 2 work without the ref intent, but something like Example 1 or Example 3 still need a ref intent.
        • My concern with this idea is that it's hard to explain the compiler magic to users, and we are relying on compiler magic for code to compile. This can make it difficult for a user to look at a piece of code and reason about whether it will work without invoking the compiler/IDE. It's hard to draw the line for where the compiler should stop and give up, and hard to communicate that line to users.
      2. Add new syntax to support users changing the default intent. For example, with (ref *) says everything in scope gets a ref intent. Users could still have other intents, like with (ref *, const A, in i), which says everything gets the ref intent except A and i, which get their respective intents.

        • This doesn't really solve the problem with simple loops; it just makes the bigger ones (Example 6) a little cleaner with a syntactic shorthand.
        • It becomes a lazy "if you get this error, put this", instead of making the programmer think critically.
        • This is true of many language constructs, just like [&] in C++. I think it's hard to prevent users from shooting themselves in the foot without taking away too much expressive power. It's a tradeoff.
        • What does the with (ref *) apply to? All variables in scope? Only the variables in the closest scope? Can we limit this to a specific type (only arrays)? Maybe with (ref []) signifies that the intent only applies to arrays?
      3. Have the default forall array intents take the default from the actual. There is an old issue describing this from the original discussion adding ref if modified, https://github.com/chapel-lang/chapel/issues/5220, that calls this specialize-blank. This was also suggested by a user in a recent discussion.

        • If an array is declared as var A: [1..10] int;, then its default forall intent is ref.
        • If an array is declared as const A: [1..10] int;, then its default forall intent is const.
        • A few examples:
        proc foo(A) { // default intent is `const`
         forall i in A.domain with (ref A) do A[i] = i; // we still get `const` intent, and need to put `ref A`
         // however, because `A` is const, `with (ref A)` is actually an error
        }
        
        proc bar(ref A) { // user is aware they are passing `A` by `ref`, they don't need to say it again
         forall i in A.domain do A[i] = i; // `A` is passed by default `ref` intent, yay!
         forall i in A.domain with (ref A) do A[i] = i; // users can be explicit and use the `ref` anyways
        }
        
        var A: [1..10] int;
        begin with (ref A) { // `begin` still requires the explicit `ref`
         forall i in A.domain do A[i] = i; // but the `forall` doesn't, inferring from the `begin` intent
        }
      • How do we rationalize this difference in orthogonality with other types? Other parallel constructs?
        • I think rationalizing across types is easy: synchronization types are already ref by default because it is in their nature to be modified, and arrays' nature is to be used in foralls.
        • To rationalize with other parallel constructs, we could make the argument that forall is a high-level, data-parallel feature and so gets special features. Whereas coforall, begin, and cobegin are more low-level/task-parallelism features and so require users to be more explicit.
bradcray commented 11 months ago

Whatever we do with forall is going to relate to task intents for other parallel structures like coforall, begin, and cobegin.

I'm having trouble determining whether you think these should be treated similarly or differently—mostly I'm guessing "differently", but in at least one place, it seemed like "similarly". My answer is "similarly"—that if we introduce an inconsistency, it'd be better to have it be between parallel contexts and argument intent contexts than between forall and other parallel constructs. So whatever rule we adopt for forall should also apply to foreach, coforall, cobegin, begin, and promotion.

I continue to find option ii.a. worthy of more discussion and consideration before throwing it out. Specifically, I think it supports the cases where we don't want to type ref while continuing to guard against potentially unintentional races in cases like A[B] = 1; or the equivalent forall loop. That said, it seems like you disagree, though I'm not understanding why:

This makes simple loops like Example 2 work without the ref intent, but something like Example 1 or Example 3 still need a ref intent.

Assuming those links are referring to the examples in the OP, examples 1 and 3 are looping over ranges, where I'm imagining the range and domain iterators would all be annotated with the "guaranteed to yield unique elements" attribute (or however we signal that property). And since the array index expressions are simply i, it doesn't seem like there's a problem there. What are you seeing as the need for ref?

If I were to describe this rule in the language spec, I might say that the pure language requires ref intents in all cases that arrays are modified, but that a given compiler implementation may choose to infer ref to save the user some trouble if it chooses to. And then in the first version of our compiler, I'd simply look for cases where [i] is used as the index for all accesses to a given variable and is being produced by an iterator so annotated. Then I'd expand support for richer expressions (e.g., affine expressions) as compelling cases come up. This is arguably a bit of a dodge and would become annoying if/when there was a second Chapel compiler, but also keeps the language spec simpler and puts the complexity into the compiler. Also, it's a non-breaking change if the language spec were to soften over time.
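Brad's first-version inference could be sketched as a simple predicate: infer ref for an array only if every access to it uses the bare loop index, and the driving iterator is annotated as unique-yielding. A minimal sketch in Python of that decision logic (the function name and the string representation of index expressions are illustrative stand-ins, not an actual compiler API):

```python
def may_infer_ref(accesses, index_is_unique):
    """Decide whether an implicit ref intent is safe to infer for an array.

    accesses: the index expressions used for the array in the loop body,
              as strings (a stand-in for real AST nodes).
    index_is_unique: whether the loop's iterator is annotated as
                     "guaranteed to yield unique elements".
    """
    return index_is_unique and all(expr == "i" for expr in accesses)

# `forall i in 1..n do A[i] = 0;` -- every access is A[i], range yields unique i
assert may_infer_ref(["i"], index_is_unique=True)
# `A[i] = A[i-1];` -- an access uses a richer expression, so require `with (ref A)`
assert not may_infer_ref(["i", "i-1"], index_is_unique=True)
# iterating over an array's elements -- uniqueness is not guaranteed
assert not may_infer_ref(["i"], index_is_unique=False)
```

Richer expressions (e.g., affine ones) would then extend the per-access check over time, as the comment above suggests.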

I'm open to ii.c. as well. I think the main downside of it is that it doesn't give users any protection against the potential for races in A[B] = 1; patterns (as well as others that ii.a. would guard against, like A[i] = A[i-1];).

compiler magic

I'm categorically against calling things "compiler magic" unless the goal is to be insulting or we truly can't imagine how to do it. I think affine index expression analysis is pretty well-understood in the compiler community.

jabraham17 commented 11 months ago

I'm having trouble determining whether you think these should be treated similarly or differently—mostly I'm guessing "differently", but in at least one place, it seemed like "similarly".

I have not totally made up my mind on this; I can make a convincing case either way. On one hand, I strongly feel that language constructs should be consistent and have simple rules that don't require running the compiler to prove. I want to be able to read a piece of Chapel code and know with certainty that it will work, and so a special case for default intents for arrays on foralls bothers me a little. On the other hand, I see foralls as already being a bit special and so can rationalize having slightly relaxed/different rules for default intents for them. I do lean towards the make-it-consistent side, though.

Assuming those links are referring to the examples in the OP, examples 1 and 3 are looping over ranges, where I'm imagining the range and domain iterators would all be annotated with the "guaranteed to yield unique elements" attribute (or however we signal that property). And since the array index expressions are simply i, it doesn't seem like there's a problem there. What are you seeing as the need for ref?

Perhaps I am misunderstanding the intention of the "guaranteed to yield unique elements" attribute, but my reasoning that these would not work was that those indices are not associated with the array (they are just from a plain range) and so would still require the ref. However, reading through this again, I think I understand that it's less about being associated with an array and more about being guaranteed to be unique. This does make me more inclined towards this idea, but I still have some reservations (see my next comment).

If I were to describe this rule in the language spec, I might say that the pure language requires ref intents in all cases that arrays are modified, but that a given compiler implementation may choose to infer ref to save the user some trouble if it chooses to. And then in the first version of our compiler, I'd simply look for cases where [i] is used as the index for all accesses to a given variable and is being produced by an iterator so annotated. Then I'd expand support for richer expressions (e.g., affine expressions) as compelling cases come up.

It's this part of the argument for ii.a. that I don't like. A user does not know whether they need to supply an explicit ref intent until they try to compile the code. It requires a compiler implementation to verify that a given piece of code is valid.

I'm open to ii.c. as well. I think the main downside of it is that it doesn't give users any protection against the potential for races in A[B] = 1; patterns (as well as others that ii.a. would guard against, like A[i] = A[i-1];

I am not as concerned about A[B] = 1;, since that has been deprecated. Are you proposing allowing this pattern again? I do take the point about A[i] = A[i-1];.

bradcray commented 11 months ago

On the other hand, I see foralls as already being a bit special and so can rationalize having slightly relaxed/different rules for default intents for them.

My argument would be that if we find a given loop like:

forall i in 1..n with (ref A) do A[i] = 0;

to be annoying to write and come up with a reason for permitting the with to be dropped, then we'd surely find its foreach and coforall equivalents to be similarly annoying:

foreach i in 1..n with (ref A) do A[i] = 0;
coforall i in 1..n with (ref A) do A[i] = 0;

Moreover, since:

[i in iterableExpr] A[i] = 0;

can be rewritten by the compiler as either a forall or a foreach loop (depending on the presence or absence of a parallel iterator), it seems like at the very least forall and foreach should be symmetric. And I think once two of the parallel constructs are symmetric, they all should be.

I think I understand that it's less about being associated with an array and more about being guaranteed to be unique

Correct, I definitely don't think this should be tied to a specific property of the array, but just a property of the iterator—i.e., can this iterator yield the same value twice? "No" for ranges, domains, and sets; "Yes" for arrays (typical arrays, anyway... maybe we'd want to define a domain map that supported unique elements only), lists, and bags. "Yes" would be the default.

A user does not know whether they need to supply an explicit ref intent until they try to compile the code.

I'm not suggesting our compiler wouldn't document when it's required. E.g., in my straw proposal, I would document it as saying "In version x.yz of chpl, a ref intent need not be supplied on an array A that is strictly read/written using A[i] within the loop's body."

I'm also not opposed to simply making this part of the language definition if that's preferable (again, with the option to relax/strengthen it further over time). In either case, I think it should be documented; it's more a question of whether we want to hold all potential Chapel compilers to the same rules, or to permit some to be more capable/aggressive in their analysis than others. (To a large extent I think this is a moot distinction, as I don't see a strong need for a multiplicity of Chapel compiler front-ends in the future, just back-ends.)

I am not as concerned about A[B] = 1;, since that has been deprecated. Are you proposing allowing this pattern again?

If forall b in B do A[b] = 1; is supported without a with-clause, then yes, I definitely think A[B] = 1; should be un-deprecated as well since I consider our rationale for deprecating it to have been to maintain consistency with forall loops. If we do end up in a world where with is required for the forall loop (as in ii.a.), then I think we can preserve the deprecation. ii.a. also gives good cover for why A[2..<n] = 1; would not be deprecated since this could be considered equivalent to forall i in 2..<n do A[i] = 1;.

bradcray commented 11 months ago

Thinking about this more over the holiday, I was realizing that it's not really whether or not the expression is affine, but whether or not it's 1:1. So, for example, we could mark + and - as being 1:1 such that A[i+1] would work if that was the only expression used within the loop to index into A. Meanwhile, / would definitely not be 1:1 since A[i/2] could map both i=4 and 5 to A[2]. * could be marked 1:1 for a param non-zero multiplicand, such that A[2*i + 1] would be OK. Or we could rely on the user to use a with ref whenever * is used. Or we could mark * as being 1:1 since it "typically" is when used in an indexing expression, though that would obviously be wrong if we had an indexing expression like A[(i%2 == 0)*i], leading to a potential race.

bradcray commented 11 months ago

Thinking about my previous comment more overnight, I'm realizing that I've oversimplified things. + is only 1:1 when one of the operands is loop-invariant. E.g., if I were to write forall (i,j) in {1..n, 1..n} do A[i+j] = 1; there would be overlap across iterations. I think this type of example is what took my mind in the "affine" direction to begin with...

So "1:1" isn't the right term to apply to a procedure or operator and isn't independent of the details of the actual arguments. I was just too tangled up in the typical/likely indexing patterns that I'd been thinking about.
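The 1:1 reasoning above can be checked mechanically: a write like A[f(i)] is race-free when f is injective over the loop's iteration space. A small, language-agnostic sketch in Python (the helper is illustrative, not part of any proposed analysis):

```python
def is_injective(index_expr, iteration_space):
    """Return True if index_expr maps distinct iterations to distinct indices."""
    seen = set()
    for it in iteration_space:
        idx = index_expr(it)
        if idx in seen:
            return False  # two iterations write the same element: potential race
        seen.add(idx)
    return True

n = 10
# Affine expression with a nonzero multiplier: injective, so A[2*i + 1] is safe.
assert is_injective(lambda i: 2 * i + 1, range(1, n + 1))
# Integer division is not 1:1: i = 4 and i = 5 both map to index 2.
assert not is_injective(lambda i: i // 2, range(1, n + 1))
# i + j over a 2D iteration space overlaps across iterations, as noted above.
assert not is_injective(lambda ij: ij[0] + ij[1],
                        [(i, j) for i in range(1, n + 1) for j in range(1, n + 1)])
```

This also shows why the property belongs to the whole indexing expression in context, not to an operator like + in isolation.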

jabraham17 commented 11 months ago

After doing some work in CHAMPS, I strongly lean towards the consistency argument. Whatever we do for forall, we should also do for coforall.

Consider the following two loop headers

        coforall loc in Locales
        with (ref numElementsOffsetInZones,
              ref zones,
              ref zoneIndicesPerFamily,
              ref numGlobalInteriorElementsPerFamily,
              ref numGlobalPeriodicElementsPerFamily,
              ref globalElementsPerFamily,
              ref globalNodesPerFamily,
              ref connectivitySizePerFamily,
              ref families) do
        on loc ...
            coforall (familyIndex, localFamilyIndex) in zip(localFamiliesIndices, 0..#numLocalFamilies)
            with (ref zoneIndicesPerFamily,
                  ref globalElementsPerFamily,
                  ref globalNodesPerFamily,
                  ref connectivitySizePerFamily,
                  ref numGlobalInteriorElementsPerFamily,
                  ref numGlobalPeriodicElementsPerFamily,
                  ref families)
          ...

Under the current deprecation rules for coforall array default intents, these with clauses are required. Without the deprecation, the loops would need no with clauses at all. So while these with clauses may add some clarity, they almost certainly fall into the same category as the forall loops from Arkouda that first caused concern.

bradcray commented 11 months ago

Out of curiosity, what are the CHAMPS indexing expressions like for those arrays?

jabraham17 commented 11 months ago

Out of curiosity, what are the CHAMPS indexing expressions like for those arrays?

For the first loop:

For the second loop:

bradcray commented 11 months ago

In my attempt to "find a dividing line between 'obviously safe' and 'non-' cases", I'd:

jabraham17 commented 11 months ago

Summary of today's discussion

We decided to go with having the default forall array intent take its default from the actual (option ii.c), with some caveats. Michael pointed out that we likely didn't go with this before because that discussion was also considering changes to function argument formal intents. With the default intent for function arguments now being const, this option is more reasonable.

As an example

proc foo(ref A) {
forall i in A.domain do A[i] = i;
}

The forall does not need an explicit with (ref ...), because we infer that it is ref from the actual.

This is also true for the following

var A: [1..10] int;
forall i in A.domain do A[i] = i;

Again, var is modifiable, so the explicit intent is not needed.

In essence: if the array is modifiable outside the loop, it is modifiable inside the loop.

We also decided that we wanted to extend this to other language constructs that can have with clauses, so that those with clauses become optional if the array is modifiable outside the scope of the block. Specifically, this is coforall, cobegin, begin, and bracket loops. This will also apply to foreach loops when they get intents implemented. This will have the effect of just relaxing the previous deprecation warnings.

For both of these cases, we discussed the affine indices proposal at length, where the with is not required if the compiler can prove the indexing is "safe". The conclusion we came to is that rather than making this the rule for default intents (with an error when you get it wrong), it will be a warning to the user that they may be writing a race condition.

We also discussed the A[B] = 1; case, where A and B are arrays. We decided to undeprecate this case and turn the deprecation warning into a race-condition warning, which is off by default. The goal in this case will be to have a way in the code for a user to express that such a promotion is "safe" and does not need to be warned about.
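To illustrate the hazard behind the A[B] = 1; warning: the promoted assignment is only racy when B contains duplicate values, since two tasks would then target the same element of A. A small Python sketch of that duplicate-write check (illustrative only, not the planned compiler analysis, which would have to warn conservatively since B's values are unknown at compile time):

```python
from collections import Counter

def racy_write_targets(B):
    """Indices of A written more than once by the promoted assignment A[B] = 1."""
    counts = Counter(B)
    return sorted(i for i, c in counts.items() if c > 1)

# B with distinct values: every element of A is written at most once, no race.
assert racy_write_targets([0, 2, 4]) == []
# Duplicate index 2: two tasks would write A[2] concurrently.
assert racy_write_targets([1, 2, 2, 3]) == [2]
```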

bradcray commented 11 months ago

Capturing a few of the big "aha" moments for me in the discussion:

I proposed that we not worry too much about such race-oriented arguments in the short-term, considering it more of a long-term strategy than something that needs to get done now. Michael pointed out that the existing A[B] = 1; warning could be turned into a race warning with little effort, though that led to concerns about whether it should be on or off by default, how a user would assert that it was safe (without rewriting as a forall loop) without annotations, etc. My thinking is that it should be off by default until we have precise ways of squashing the warning (and maybe even after that as well), where I'm imagining a --warn-races or --warn-potential-races flag to opt in.

One other thing to capture: Andy asked about why, if we think this is the right thing to do for arrays, we don't do it for scalars as well (i.e., a var int is passed by ref by default, a const int is passed by const by default). My attempt at rationalizing this was:

Michael pointed out that one could write safe modifications to a scalar using a lock, and that we could similarly make the compiler smarter about detecting patterns like that, so he explained it more simply (as I understood it) as "We've chosen to optimize different data structures for different cases based on the likelihood of them being problematic."

jabraham17 commented 9 months ago

I am closing this issue, all action items from this discussion have been merged (see #24135 and #24240). There are some followup items outlined in #23976 about further race condition warnings.

bradcray commented 9 months ago

Thanks for taking care of this one, Jade! It was a big one, but I'm very happy with where we ended up with the effort!