Do we have the wrong behavior currently for loops over `zip(1.., expr)`? (zipped unbounded ranges)

While writing some code last week, I was surprised to find that a loop like:

for (i,j) in zip(1.., 1..5) do
  writeln((i,j));

currently loops five times then stops without complaint. My expectation was that it would loop five times and then complain of a size mismatch between the two iterands. Specifically, Chapel generally complains about size mismatches in serial loops (and tries to in parallel loops, though it's unsuccessful in some cases, noted in issue #11428); and it seems that it ought to be complaining in this case as well.

In the parallel zippered case, we've traditionally given unbounded ranges in follower positions special "can conform to anything" semantics, which permit loops like:

forall (a, i) in zip(my1dArray, 0..) do ...

and we've discussed extending leader-follower iterators to permit users to write their own "conforming" iterators, though that has not yet been implemented. For that reason, it does not surprise me that:

for (i,j) in zip(1..5, 1..) do

would run five times then terminate without complaint, but I would expect that if an unbounded range is in the first/leader position of the zip() that the loop would be conceptually infinite. Note that this can be useful in cases where the loop contains a break/return/exit, as in this (toy) example:

for (i,j) in zip(0.., genRandNums()) {
  if j == 0.5 {
    writeln("Got 0.5 on iteration ", i);
    break;
  }
}

This issue asks whether we've made a mistake here, or whether there's some rationale for the current behavior that I'm missing.

I really like the current behavior, and in polling I actually voted for it as being desirable. It seems like if one of the participants in a zip is an unbounded range, then it should never be consulted about terminating the loop, regardless of whether it is in the leader or the follower position (it doesn't really make sense to - it's conceptually infinite, disregarding overflow issues). So it seems natural to me that if the leader is unbounded 0.., we'd consult the follower 1..5 about when to terminate the loop.

I'm naive as to the implementation of zip/leader/followers and all that jazz, so my uninformed opinion is that it'd be a shame if this didn't work (as it seems a natural way to express the pattern), and it'd also be disappointing if I had to reorder the zip expressions so that the unbounded range appears in the follower position.

EDIT: It looks like we already give unbounded ranges special treatment in the parallel zippered case. Could we extend those semantics to the serial iteration case? (I guess that's what this question might be asking?)

Note that the iterators primer has an example that specifically calls out this behavior (even with the same ordering): https://chapel-lang.org/docs/latest/primers/iterators.html#fibonacci - last code block before "multiloop"

It looks like we already give unbounded ranges special treatment in the parallel zippered case. Could we extend those semantics to the serial iteration case? (I guess that's what this question might be asking?)

We may be thinking about different things as the "special treatment" here, but I'd argue that we've already extended the parallel zippered context's special handling of unbounded ranges to the serial case (in the sense that "no size check is required for unbounded ranges"). Yet, where the parallel zippered case only permits unbounded ranges to be in the follower position, the serial zippered case permits them to be in any position, giving them the same "conform to whatever is required" interpretation.

I'm naive as to the implementation of zip/leader/followers and all that jazz...and it'd also be disappointing if I had to reorder the zip expressions so that the unbounded range appears in the follower position.

I don't think a performance-minded Chapel programmer can get very far with the language without understanding the roles of leader/follower iterands, so don't think we should consider that an advanced topic that users don't need to concern themselves with. Specifically, when you do a forall loop over distinct things (say, a distributed array, a local array, a range, and/or a user iterator), it's key to understand which iterand makes decisions like how many tasks we're running, where they're running, and how the iteration space is divided between them. For example, forall (a, b) in zip(MyDistributedArray, MyLocalArray) behaves very differently from forall (a, b) in zip(MyLocalArray, MyDistributedArray) in Chapel: the former will create tasks across all the cores of the locales that own the distributed array; the second will only use create tasks for local cores. And then the policy about which tasks own which iterations depends on the details of the distribution.

So Chapel is a language where order in zippering matters. And I think that making it matter in the serial case not only makes it more consistent, but also more powerful, even though it does require a user to learn that "order matters."

it should never be consulted about terminating the loop, regardless of whether it is in the leader or the follower position (it doesn't really make sense to - it's conceptually infinite, disregarding overflow issues).

That interpretation means that there's no way for me to write a zippered loop that I intend to be "conceptually unbounded", but where the expected behavior is to have a break/return/exit to get out of the loop (and where I'd want the implementation to complain to me if I didn't get that expected behavior). To me, that seems slightly non-orthogonal and limiting.

As an example, here's a toy code where I'm happy to be able to loop over an unbounded range and the range itself conceptually means "iterate forever" even though we know it'll terminate as intended as long as I don't choose a too-large n:

proc findNthPrime(in n) {
  for i in (1..) {
    if isPrime(i) {
      n -= 1;
      if n == 0 then
       writeln("Nth prime is: ", i);
       return;
    }
  }
}

I'd argue that if I do choose an n that's too large, it's nice that the implementation will (now) yell at me when I run off the end of the range rather than just silently terminating.

So what seems inconsistent to me (knowing that order matters in zippered contexts in Chapel) is that if I zip something else into the loop above, that second thing gets to cap my iteration unnecessarily and silently. For example, imagine a pattern like:

// search through a collection for a given value
// I'm writing this code fully expecting to find the value in the collection
proc findNeedleInHaystack(haystack, needle: haystack.eltType) {
  for (i, elem) in zip(0.., haystack) do
    if elem == needle {
      writeln("Found needle in iteration ", i);
      return;
    }
  }
}

With the current behavior, if I run off the end of the collection, the code will silently proceed without complaint. Whereas with the proposed behavior, we'd get a size mismatch once the values being yielded by 0.. exceeded the haystack's size. For a given program or programmer, maybe that's a good thing and maybe it's not, but at least supporting both behaviors lets me put the iterands into an order that gives me the behavior I want rather than forcing me into a specific behavior. And it's nice that it's consistent with parallel loops in terms of "the first iterand is the one that characterizes the loop."

I also think it's inconsistent that an "infinite" user iterator like:

iter myInfiniteRepl() {
  while (1) {
    const str = readLine();
    yield str;
  }
}

will result in a size mismatch check when driving a loop like: for (str, i) in zip(myInfiniteRepl(), 1..5) but that the "infinite" iterator for an unbounded range does not.

Note that the iterators primer has an example that specifically calls out this behavior (even with the same ordering): https://chapel-lang.org/docs/latest/primers/iterators.html#fibonacci - last code block before "multiloop"

Yeah, that's one of the cases that made me realize this was our current behavior when I broke it while fixing another bug.

TLDR

As I understand it, this loop works just fine if the bounded follower is in the leader position, but otherwise runs into a corner case of the language that we either won't/can't/aren't willing to support yet. So I think we should just do the "runtime error if we exhaust the follower" approach for now, so that everything is consistent.

However ultimately I really do hope that we could allow leaders/followers to communicate with each other and enable a leader to defer to a follower(s) to decide when a loop terminates, so that this program could work as I originally expected.

We may be thinking about different things as the "special treatment" here, but I'd argue that we've already extended the parallel zippered context's special handling of unbounded ranges to the serial case (in the sense that "no size check is required for unbounded ranges"). Yet, where the parallel zippered case only permits unbounded ranges to be in the follower position, the serial zippered case permits them to be in any position, giving them the same "conform to whatever is required" interpretation.

Ok, I agree that they should be consistent.

I don't think a performance-minded Chapel programmer can get very far with the language without understanding the roles of leader/follower iterands, so don't think we should consider that an advanced topic that users don't need to concern themselves with.

Ouch, that stings 😁. Ok, I'll agree that order matters based on the examples you've given.

That interpretation means that there's no way for me to write a zippered loop that I intend to be "conceptually unbounded", but where the expected behavior is to have a break/return/exit to get out of the loop (and where I'd want the implementation to complain to me if I didn't get that expected behavior). To me, that seems slightly non-orthogonal and limiting.

It seems like the only way I could get this "conceptually unbounded" with a zip is if all expressions in the zip are unbounded ranges (or types with iterators that behave in the same way)?

If we need to know that we computed a result before exiting the loop, couldn't we just use a flag? (But I suppose that would be cumbersome.)

With the current behavior, if I run off the end of the collection, the code will silently proceed without complaint. Whereas with the proposed behavior, we'd get a size mismatch once the values being yielded by 0.. exceeded the haystack's size. For a given program or programmer, maybe that's a good thing and maybe it's not, but at least supporting both behaviors lets me put the iterands into an order that gives me the behavior I want rather than forcing me into a specific behavior.

I think the point of contention here is whether or not there should be a runtime error if we exhaust the follower and the leader is unbounded? In the "leader is finite but follower is unbounded" case we're always going to terminate.

Based on my experiences I think Chapel leans pretty hard on the "runtime error if follower exhausts first" stance. So it wouldn't be at all untoward if we were to do the same in this case (and we can justify it as you said with "order matters").

However it still seems like such a wasted opportunity to me, and I would love if there was a way for the leader and follower to communicate to each other "hey, even though I'm the leader, I give you (the follower) permission to drive this loop". Or "hey, if my follower exhausts first, exit silently".

Because when I read this loop, I can't really imagine any semantic interpretation (no matter how hard I try, lol) that isn't "zip until we exhaust the follower".

And I think I would (naively, perhaps) extend that desire to zipping any leader/followers together - I'd love to be able to control whether we get a runtime error as opposed to intentionally walking off the edge of the smallest bounded collection.

I haven't read / followed all the discussion but from the OP

My expectation was that it would loop five times and then complain of a size mismatch between the two iterands.

I agree with this

As I understand it, this loop works just fine if the bounded follower is in the leader position

Changing "bounded follower" to "bounded expression", yes.

but otherwise runs into a corner case of the language that we either won't/can't/aren't willing to support yet.

Unless I'm misunderstanding what corner case you're referring to, I wouldn't say this is running into a case we won't/can't support yet, because we are supporting it today. I just think it's inconsistent/surprising that we do so today given what we do for parallel loops (so "aren't willing to" may be the correct interpretation here).

However ultimately I really do hope that we could allow leaders/followers to communicate with each other and enable a leader to defer to a follower(s) to decide when a loop terminates, so that this program could work as I originally expected. ... However it still seems like such a wasted opportunity to me, and I would love if there was a way for the leader and follower to communicate to each other "hey, even though I'm the leader, I give you (the follower) permission to drive this loop". Or "hey, if my follower exhausts first, exit silently".

More communication between leaders and followers is definitely part of the leader-follower 2.0 plan, though there aren't (m)any coherent proposals about how to achieve it yet. I think a challenge to what you're proposing is that it would effectively require creating an interface that would permit an arbitrary set of expressions to negotiate with each other about what their loop means, and I'm not sure how to achieve that. It seems incredibly challenging to me. E.g., maybe a leader could defer to a follower, but to which follower if there are multiple? And if each follower had its own idea about what it would want to do, how would that negotiation take place? These kinds of questions are very much the reason we gave special precedence to the leader expression to begin with.

So I think we should just do the "runtime error if we exhaust the follower" approach for now, so that everything is consistent.

It seems like the only way I could get this "conceptually unbounded" with a zip is if all expressions in the zip are unbounded ranges (or types with iterators that behave in the same way)?

Mmmmaybe, though I think we're getting into questions of terminology more than behavior here due to the fact that any of our loops are going to terminate at some point, whether over bounded ranges, unbounded ranges, or collections. Put another way, if for i in 0.. is conceptually unbounded and means "iterate over non-negative integers until we exceed what we can represent or break/return out of the loop" then in my proposal for (i,a) in zip(0:int(w).., MyArray) is also conceptually unbounded, but practically it's going to have a problem if that break/return doesn't occur in the first MyArray.size iterations, or the array is larger than the integers represented by 0:int(w)... So yes, it's still technically bounded, but arguably any of our loops over ranges are due to the fact that they're not arbitrary precision.

If we need to know that we computed a result before exiting the loop, couldn't we just use a flag? (But I suppose that would be cumbersome.)

Right, it's not that there aren't other ways to get the "follower didn't have enough elements" behavior in the current implementation (e.g., add a halt after the loop), it's just that (for me), it feels too inconsistent with parallel loops (which is why it was so surprising to me) and slightly inconvenient to require given the inconsistency.

I think the point of contention here is whether or not there should be a runtime error if we exhaust the follower and the leader is unbounded? In the "leader is finite but follower is unbounded" case we're always going to terminate.

Right.

Based on my experiences I think Chapel leans pretty hard on the "runtime error if follower exhausts first" stance. So it wouldn't be at all untoward if we were to do the same in this case (and we can justify it as you said with "order matters").

Right, though in a zippered iteration over two bounded things of different sizes, it should give a runtime error whichever one is first. We just have a bug in our forall loops currently that doesn't always do that. Put another way, forall (i,j) in zip(1..3, 1..4) isn't intended to be legal in Chapel even though users can get away with it today.

So I think of it as:

leader says what we're iterating over and how
follower says / should be able to say whether it can handle that leadership or not, generating a compile-time error if it knows at compile-time, and an execution-time error if it can't at runtime (e.g., "I'm just realizing I don't have enough elements, or too many")"
bounded ranges, arrays, and domains require very strict size/shape mismatches to avoid errors
unbounded ranges, when leading, are conceptually unbounded; when following, are happy to conform to other (1D only?) things that are leading regardless of size (well, as long as their idxType and range of integers supports that many unique values, I guess)

One other quick thought on this this morning. Beyond simply arguing slightly abstractly that it would be asymmetrical for our serial loops not to let the leader establish the size/shape of the iteration space, here's a plausible case where it wouldn't simply be abstract.

Say we supported a parallel leader iterator over an unbounded range like 1.. (which we don't today, though it could make sense for programs that were going to terminate within the loop, or in the presence of adding a eureka concept). Such an iterator might deal iterations out to tasks in a cyclic or block-cyclic manner, for example, making such a loop conceptually unbounded. If we were then to change it from a forall to a for, we'd get a very different interpretation, which seems unfortunate/surprising.

I ran back into this today and have put together a quick attempt to implement the proposed change in https://github.com/bradcray/chapel/tree/serial-zip-leader-leads to understand the implications. Running testing now.

Based on the thumbs-ups here, DavidL's acquiescence and no other objections, I'm currently planning on proceeding with this in #20891.

chapel-lang / chapel

Do we have the wrong behavior currently for loops over `zip(1.., expr)`? (zipped unbounded ranges) #20051