Open vasslitvinov opened 5 months ago
A somewhat-related FAQ: why do we have _iteratorClass
and _iteratorRecord
and how are they different?
My mental model is that during execution an IR (iterator record) is created initially, containing the arguments that the iterator function was invoked with. Then it is passed to getIterator() to create an IC (iterator class), which holds the state that the iterator needs to retain across its yield statements. The IR is not used while the iterator is running. The IC is deinitialized once the iterator is done, when the loop body executes break, throws, etc.
Here are some clarifications on the CURRENT resolution rules to go with our parallel iterators primer. However, let us start with design questions relating to the current rules:
Is the serial iterator required when its parallel counterpart is invoked in a parallel loop?
For example, given:
is it sufficient to declare:
or is it an error when this serial iterator is absent:
Currently the serial iterator is required in some cases and not required in others.
Other design questions include:
When both the serial and the parallel iterators are required or otherwise looked at, how closely should their formal arguments match? For example, can one of the family have an in-intent
int(8)
formal and another a ref-intentint(64)
formal? Can the positionally-same formals have different names?When invoking a leader and a follower, should the actuals to the leader invocation be cached and passed to all the follower invocations? Currently they are cached.
When the first iterable of a zippered
[ ]
loop resolves to a leader+follower, are the remaining iterables required to be followers? Or when one of the followers is not available, should all iterables fall back to the serial iterators? If they do, should there be a warning?A potential simple style for defining the rules
We could define the semantics using the rewrite rules to make some aspects of the current implementation explicit. For example:
Some current rules
Given
the compiler resolves
foo(tag == leader)
and two timesfoo(tag == follower, followThis)
. These obviously must resolve to iterators, not procs. It might be OK for any of these to resolve to a proc marked with pragma "fn returns iterator".The implementation expects the parallel iterators to be DECLARED with
tag
and (for followers)followThis
formals. The users are not supposed to passtag
andfollowThis
actual arguments explicitly.When resolving the parallel iterators for a given parallel loop, the compiler implicitly generates the call by adding
tag
and (for followers)followThis
as NAMED actuals, then the normal resolution rules apply. This means that the parallel iterators must name their formals exactly that.Given
the compiler tries first to resolve
myIter(a, b, tag=iterKind.standalone)
. If this does not resolve, it triesmyIter(a, b, tag=iterKind.leader)
; assuming success it then attempts to resolvemyIter(a, b, tag=iterKind.follower, followThis=a value of the yielded type of the leader)
.A follower iterator may have a generic
followThis
formal, a concretefollowThis
formal, or even multiple overloads expecting different types offollowThis
. The follower declaration(s) determine what leader yielded types this iterator can "follow."Cf. for a serial
for
orforeach
loop, no additional actuals are added to the iterator calls, whether it is a zippered loop or not.Back to parallel iterators, generally the user is not expected to pass
tag
and/orfollowThis
named actuals explicitly. If they do, it may lead to resolution errors, for example due to duplicate namedtag
actuals.EXCEPTION: calling an iterator with a
tag=iterKind.standalone
ortag=iterKind.leader
actual in a FOR loop turns this for-loop into a parallel loop over the corresponding iterator. For example, the current implementation uses such for-loops to forward standalone and leader iterators over an array to the iterators defined by its distribution.How to support intentionally-order-dependent forall loops?
This comes up, for example, in “Run Stencil Run” / diamond tiling computations. While it probably still works, the spec says that it’s not a legal program, because the forall loop syntax indicates that the loop body is order-independent, and it is not in these cases.
We decided we needed a separate flavor of forall loops to support this scenario.
The part that remains a wrinkle is that a forall loop is lowered by combining the loop body and the iterator implementations. The follower iterator implementation might not be order-independent, and in that case the loop is not vectorizable. Follower iterators indicate their order-independence by using
foreach
instead offor
. We have some heuristic optimization that makes simple iterator forwarding cases not need foreach.