Resolution rules for parallel loops

Here are some clarifications on the CURRENT resolution rules to go with our parallel iterators primer. However, let us start with design questions relating to the current rules:

Is the serial iterator required when its parallel counterpart is invoked in a parallel loop?

For example, given:

forall i in myIter() {...}

is it sufficient to declare:

iter myIter(param tag) where tag == iterKind.standalone { ... }

or is it an error when this serial iterator is absent:

iter myIter() { ... }

Currently the serial iterator is required in some cases and not required in others.

A potential simple style for defining the rules

We could define the semantics using the rewrite rules to make some aspects of the current implementation explicit. For example:

forall idx in myIter() { body; }
// is equivalent to
for idx in myIter(tag=iterKind.standalone) { body; }
// if the standalone iterator exists

forall (i1, i2) in zip(iter1(), iter2()) { body; }
// is equivalent to
for followMe in iter1(tag=iterKind.leader) {
  for (i1, i2) in zip(iter1(tag=iterKind.follower, followThis=followMe),
                      iter2(tag=iterKind.follower, followThis=followMe))
    { body; }
// plus additional rules for the actuals, if any

Some current rules

Given

forall tup in zip(foo(), foo()) do body;

the compiler resolves foo(tag == leader) and two times foo(tag == follower, followThis). These obviously must resolve to iterators, not procs. It might be OK for any of these to resolve to a proc marked with pragma "fn returns iterator".

The implementation expects the parallel iterators to be DECLARED with tag and (for followers) followThis formals. The users are not supposed to pass tag and followThis actual arguments explicitly.

When resolving the parallel iterators for a given parallel loop, the compiler implicitly generates the call by adding tag and (for followers) followThis as NAMED actuals, then the normal resolution rules apply. This means that the parallel iterators must name their formals exactly that.

Given

forall idx in myIter(a,b) do body;

the compiler tries first to resolve myIter(a, b, tag=iterKind.standalone). If this does not resolve, it tries myIter(a, b, tag=iterKind.leader); assuming success it then attempts to resolve myIter(a, b, tag=iterKind.follower, followThis=a value of the yielded type of the leader).

A follower iterator may have a generic followThis formal, a concrete followThis formal, or even multiple overloads expecting different types of followThis. The follower declaration(s) determine what leader yielded types this iterator can "follow."

Cf. for a serial for or foreach loop, no additional actuals are added to the iterator calls, whether it is a zippered loop or not.

Back to parallel iterators, generally the user is not expected to pass tag and/or followThis named actuals explicitly. If they do, it may lead to resolution errors, for example due to duplicate named tag actuals.

EXCEPTION: calling an iterator with a tag=iterKind.standalone or tag=iterKind.leader actual in a FOR loop turns this for-loop into a parallel loop over the corresponding iterator. For example, the current implementation uses such for-loops to forward standalone and leader iterators over an array to the iterators defined by its distribution.

How to support intentionally-order-dependent forall loops?

This comes up, for example, in “Run Stencil Run” / diamond tiling computations. While it probably still works, the spec says that it’s not a legal program, because the forall loop syntax indicates that the loop body is order-independent, and it is not in these cases.

We decided we needed a separate flavor of forall loops to support this scenario.

The part that remains a wrinkle is that a forall loop is lowered by combining the loop body and the iterator implementations. The follower iterator implementation might not be order-independent, and in that case the loop is not vectorizable. Follower iterators indicate their order-independence by using foreach instead of for. We have some heuristic optimization that makes simple iterator forwarding cases not need foreach.

chapel-lang / chapel