Open bakkot opened 1 year ago
you get the same results in the same order
Does this mean that the first promise (from the first call to .next()
) needs to settle first? Or just that it needs to settle with the first value?
I would prefer the second: that the values should be yielded in the same structural sequence, but not necessarily in the same temporal sequence. Of course, for everything but map
they're more or less the same.
But to satisfy the property "as long as the underlying iterator is race-free and the mapping function is pure, the resulting iterator is race-free", iter.map(someFunction)
could have the second promise resolve before the first.
In your example with underlying.map(thatFunction)
, I would argue that a throwing function is not pure.
Does this mean that the first promise (from the first call to
.next()
) needs to settle first? Or just that it needs to settle with the first value?
As written I don't mean for it to imply that it needs to settle first, just that it needs to settle with the first value. But I think it's a consequence that it needs to settle first, at least for helpers, at least on the assumption that we maintain the property of existing async iterators in the language that you get { done: true }
from any call to .next
which occurs sequentially after a call to .next
which ultimately rejects or settles with { done: true }
. There's no way to get that property without knowing if the previous call to the helper function is going to throw.
(Also if you're preserving order of values it's fundamentally impossible to settle later promises before earlier ones for .filter
, except in the case that you've exhausted the sequence.)
In your example with
underlying.map(thatFunction)
, I would argue that a throwing function is not pure.
Sure, that's true depending on your definitions of "pure", but I intend for it to count as pure for the purposes of this consistency property. Essentially I mean to talk about purity in a world where we model exceptions by saying that all functions are actually returning a Result
.
Revisiting this, I think the consistency property mentioned above may be too strong. In particular, the fact that it forces returned promises to settle in order (at least those prior to the first { done: true }
) means it impossible to do something like Rust's buffer_unordered, which is an incredibly useful method.
A potential weakening of the property which recovers the ability to write buffer_unordered
could be that the results are race-free up through the first rejected promise or {done: true}
. If you read past that point, you may see inconsistent results. Since all consumers of the iteration protocol stop once they encounter a rejection or {done: true}
, I think having inconsistent behavior past that point is less necessary. You have to explicitly opt in to being exposed to inconsistency by either manually consuming the next()
results, or by using buffer_unordered
.
I think we should guarantee you get the same results in the same order as if you had made the calls sequentially (assuming no funny business in the mapper/iterator).
To be more precise:
Say that an iterator is race-free if it gives the same results in the same order regardless of whether calls to its
.next
method happen sequentially or concurrently.Note that this is a very weak property: an iterator can do stuff like returning
[{ done: true }, { done: false }]
and still be race-free.The fundamental consistency property of iterator helpers is that as long as the underlying iterator is race-free and the mapping function is pure, the resulting iterator is race-free. (In the case of
flatMap
, which has multiple underlying iterators, they all need to be race-free, and also to share no state, in order for this guarantee to hold.)This immediately implies that promises have to settle in order. For example, consider an async mapping function which
{ done: true }
(or, equivalently, throws).{ done: false, value: 0 }
.Consider
it = underlying.map(thatFunction); first = it.next(); second = it.next()
. Even though a result forsecond
is available immediately, if you had put asleep(60_000)
before thesecond =
, you would have seen{ done: true }
forsecond
. So that's what you need to see in this code as well to be race-free.That is, there's no way you can get the right answer for
second
without seeing howfirst
resolves, because the result offirst
might indicate the end of iteration.[edit: actually, a promise for
{ done: true }
can settle immediately without violating the consistency property, because once you see that you know all future values will be{ done: true }
.]