whatwg / infra

Infra Standard
https://infra.spec.whatwg.org/
Other
118 stars 95 forks source link

"For each" is not clear what happens if the list is modified while iterating #396

Open domenic opened 3 years ago

domenic commented 3 years ago

Apparently I do this sometimes: https://wicg.github.io/app-history/#inform-app-history-about-canceling-traversals step 2 calls into https://wicg.github.io/app-history/#finalize-with-an-aborted-navigation-error which in step 10.2 will remove the item from the list.

Options:

  1. Disallow this and force the spec to make a copy of the list before iterating
  2. Create a copy automatically when the iteration starts
  3. Do something more complicated where deletions are handled transparently. I think this would look like:
    • If you delete something from the list before the currently-being-iterated value, that's OK. We ensure that it's you move on to the next value correctly (and don't e.g. skip forward 2 as an index-based approach would).
    • If you delete the currently-being-iterated value, that's OK. We ensure that you move on to the next value correctly (and don't e.g. skip forward 2 as an index-based approach would).
    • If you delete something after the currently-being-iterated value, that's OK. We ensure that it won't be seen in the rest of the iteration.

Thoughts welcome!

annevk commented 3 years ago

If you consider deletions, don't you have to consider insertions as well? It seems an algorithm that can do 3 might get quite complicated and I strongly suspect you cannot handle both arbitrary insertions and deletions well.

domenic commented 3 years ago

Yeah. Probably just disallowing it is the most reasonable.

jakearchibald commented 3 years ago

Adding to a list/set while iterating can be useful, as you discover more items to iterate over.

Could this just behave like JS?

wanderview commented 3 years ago

Without being able to run tests against the spec, how would one catch accidentally violating (1)? It seems perhaps safer to create a copy by default and then add a mutable iteration hook. Reviewers could see that mutable iteration is dangerous and perhaps do extra checking.

jakearchibald commented 3 years ago

Fwiw, I rely on mutating a list while iterating here https://github.com/whatwg/html/pull/6315/files#diff-41cf6794ba4200b839c53531555f0f3998df4cbb01a4d5cb0b94e3ca5e23947dR84175.

I don't mind rewriting it, but it seems to work pretty well.

domfarolino commented 3 years ago

Interesting, after Jake's comment I looked at forEach, and I didn't realize that it always took a snapshot of the length so if you insert items in the middle of iterating, you will never actually get to those pre-existing items towards the end of the list. (Interestingly if you remove items while iterating, the algo doesn't blow up because it caters to the situation where the length-snapshot is longer than the live-list).

Could this just behave like JS?

I kind of like this idea too. Out of curiosity, @jakearchibald does your PR rely on appending things mid-iteration and then also iterating over them later in the same for-each loop? Because the JS model would break that I believe.

domenic commented 3 years ago

Fascinating, I didn't realize that was how forEach() behaves. It's not how for-of behaves! (Since that relies on an iterator protocol to lazily get the next item each time, as far as I can tell. Although it also does a length check in the spec? https://jsbin.com/ledukitule/edit?html,console )

tabatkins commented 2 years ago

I'm rewriting some WebIDL stuff so maplike/setlike can use Infra maps/sets, and just ran into this as well!

Note that JS does allow you to insert arbitrary items in the middle of iteration, and will continue iterating over them for Maps and Sets; you can infinite-loop yourself like this. Note that step 2.d.iii.6 of CreateMapIterator recomputes numEntries at the end of every loop, after the yield has returned from author code, so it can take into account any changes that author code has made while the function was paused. forEach does essentially the same thing but more implicitly with a "spec iterator" over the entries. (The two are different only for historical reasons, per bakkot; they were last tweaked at different times and haven't fully merged their editorial styles, but are intended to have the same behavior in this regard.)

Array's forEach does, indeed, do something different and prevent infinite-looping. Array's @@iterator does do a length check, but only to tell when it's past the (current!) length of the array; it recomputes the length on every iteration so you're allowed to grow it in each iteration. Array's forEach behavior is probably an legacy detail, then; it seems the platform generally allows and respects mid-iteration mutations in collections. If you're inserting at the end, you're guaranteed to see it; if you're inserting anywhere else, you'll see it if we haven't already passed that index in our iteration.

So yeah, we should capture these details in Infra.

tabatkins commented 2 years ago

I'll post a PR in a bit (in the middle of some other edits right now), but my intention is just to specify that, when iterating over any of the Infra collections, internally the iteration maintains an internal index, initially 0 and incremented at the end of each iteration, which it checks against the current collection size at the start of each iteration, and uses to fetch the entry each iteration will use. Spec text (including author code) can run between "fetch the indexth entry" and "increment index", changing the length and/or order of the collection, but we'll still get a well-defined iteration behavior, and it'll match most of JS's behavior.