tc39 / proposal-deiter

Double-Ended Iterator and Destructuring

nextLast() instead of next("last") #11

Closed hax closed 2 years ago

hax commented 2 years ago

In the plenary at which this proposal was first presented, it was suggested (by @codehag, if I recall correctly) to use nextBack() instead of next("back"). As #9 / #10 changed "back" to "last", it would now be nextLast() instead of next("last").

The reason I didn't follow Rust's nextBack() API in the first place was the consideration of generators (https://github.com/tc39/proposal-deiter/issues/9#issuecomment-1143128315). But I'm reconsidering it.

Several points:

  1. It could help resolve the confusion @ljharb commented on.
  2. No magic string is needed. When I introduced the proposal to JS programmers, people often asked whether it is possible to avoid the magic string. Some suggested using a symbol, but we shouldn't use a symbol because symbols are for protocols. It's actually common to use magic strings as enums in JS APIs, but obviously many people dislike magic strings anyway.
  3. nextLast() looks even more consistent with findLast, lastIndexOf, and potential iterator helpers takeLast, dropLast, etc.
  4. The existence of nextLast could be used to test whether an iterator is a double-ended iterator (or supports reverse iteration), so we could drop the separate symbols @@doubleEndedIterator (or @@reverseIterator), and we also wouldn't need separate flag fields (which next("last") may need). This seems like the simplest solution to #5.
  5. next("last") could allow writing deiters in generators with the function.sent proposal. Unfortunately, the function.sent proposal has several issues; I tried to reshape the syntax from the meta property function.sent to a "receive param", but as of the previous meeting we are still far from consensus. So I'd like to drop the initial idea of "directly supporting writing deiters in generators", focus on the double-ended iterators themselves, and leave the generator issue to function.sent or future proposals (nextLast() could still be layered on next("last") in such proposals).

ljharb commented 2 years ago

The word "next" appearing anywhere is the confusion, imo - "next" means forward, and "nextBack" is just as confusing.

hax commented 2 years ago

@ljharb Do you have any suggestion for the API name?

ljharb commented 2 years ago

prev or previous seems appropriate as an antonym to next, I suppose? (I'm still not convinced it's feasible to add this functionality in a nonbreaking way)

hax commented 2 years ago

@ljharb prev is not an option; it's even worse than the original word back, because prev means bidirectional, not double-ended. I don't know how I can explain it better if you've already read the FAQ.

Jack-Works commented 2 years ago

next("last") is a footgun. imagine an iterator does not support deiter, then this will happen:

let [a,     ...b, c] = noDeiterSupportIteratorOf(0, 1, 2, 3)
//   ^0th  rest^  ^1st

a // 0
c // 1 (!! we called next("last"), but "last" is ignored by the next method)
b // [2, 3]

But I have no idea what name we should use.

ljharb commented 2 years ago

That’s true - but double-ended without being bidirectional is also a confusing concept. It’s more like two separate iterators on the same underlying collection - if it’s one iterator, then it probably shouldn’t have multiple directions/ways to iterate?

hax commented 2 years ago

@Jack-Works Yes, we always need a mechanism to mark deiters. This was discussed in #5. For next("last"), a possible symbol-based solution is https://github.com/tc39/proposal-deiter/issues/5#issuecomment-1019751577 .

With such a solution, if we move to nextLast(), everyone just invokes x[Symbol.iterator]() and checks whether the iterator supports next and/or nextLast.
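
A minimal sketch of that capability check (isDoubleEnded is a hypothetical helper name, not part of the proposal):

function isDoubleEnded(iterable) {
  const iter = iterable[Symbol.iterator]()
  // under the nextLast() design, support is visible on the iterator itself
  return typeof iter.nextLast === "function"
}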

hax commented 2 years ago

@ljharb Bidirectional means you can restore a previous state; double-ended means you can consume from both ends. I think the problem is not that they are confusing, but that they are relatively abstract interfaces/concepts, unfamiliar to JS programmers.

It’s more like two separate iterators on the same underlying collection

Yeah, that's why it is called a "DOUBLE-ended" iterator. Normally the underlying collection (or similar structure, for example, range(start, end)) is a deque. next() and nextLast() are just shift() and pop() (in C++-style terms, popFront and popBack; in Java-style terms, poll and pollLast).

Of course we can have two separate iterators. That's a reverse iterator. But a reverse iterator is not enough for let [a, ...rest, b] = x. And in almost all cases, if a collection can support a reverse iterator, it can support double-ended iteration. So I suppose it's better to write only the double-ended iterator, so you only need to maintain the iteration mechanism in one place, though it will be a little harder at first.

The good news is that only library authors need to care about iterators; most programmers just use destructuring and never need to worry about them.

conartist6 commented 2 years ago

If we move to nextLast(), all just invoke x[Symbol.iterator](), and check whether the iterator support next and/or nextLast.

@hax You never really addressed my original complaint. Let's say this proposal proceeds. I write this:

function map(iterator, fn) {
  return {
    next() {
      return fn(iterator.next());
    },
    nextLast() {
      return fn(iterator.nextLast());
    } 
  }
}

How can I tell if map(iter).nextLast() is going to be safe? I kind of just have to run it and see if it causes an iterator.nextLast is not a function error.

To avoid this problem I'd need to write map as:

function map(iterator, fn) {
  const next = () => {
    return fn(iterator.next());
  };
  const nextLast = () => {
    return fn(iterator.nextLast());
  };

  return iterator.nextLast ? { next, nextLast } : { next };
}

This rules out using generators to implement iterator helpers, so for starters you've made pretty much all current iterator code unsafe.

ljharb commented 2 years ago

You don’t need any iterator protocol changes to support [a, ...rest, b] = - you just need to exhaust the iterator, and have cached the last N results
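
A sketch of that buffering approach (splitLast is a hypothetical helper, not an existing API): exhaust the iterator while holding back a sliding window of the last n results.

function splitLast(iterator, n) {
  const init = []   // everything except the final n values
  const tail = []   // sliding window over the last n values seen
  for (let r = iterator.next(); !r.done; r = iterator.next()) {
    tail.push(r.value)
    if (tail.length > n) init.push(tail.shift())
  }
  return { init, tail }
}

// let [a, ...rest, b] = iter could then bind as if by:
// const { init: [a, ...rest], tail: [b] } = splitLast(iter, 1)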

Jack-Works commented 2 years ago

But the intention of this proposal is to not exhaust the iterator.

ljharb commented 2 years ago

That’s very confusing to me; any use of … should exhaust the iterator, since it’s explicitly saying “grab everything available”

Jack-Works commented 2 years ago

That’s very confusing to me; any use of … should exhaust the iterator, since it’s explicitly saying “grab everything available”

Not necessarily - do you remember we introduced [a, ...] in the pattern matching proposal precisely to not exhaust the iterator?

conartist6 commented 2 years ago

I can't get over taking what seems to be a relatively niche use and fundamentally altering the definition of what an iterator is to support it! It seems to me like you've allowed the highest level concerns to drive the lowest level concerns.

You say that it should be possible to write const [first, ...rest, last] = iter and your solution is for all iterator implementations to be rewritten to support it. I just can't imagine that happening.

ljharb commented 2 years ago

@Jack-Works that's a fair point :-P Any use of it with a variable, then. How else would you fill it with the proper values?

@conartist6 Not sure who "your" refers to here - are you talking about the proposal, or about my claim that it should be achievable without changing iterators?

conartist6 commented 2 years ago

@ljharb Exactly. Since capturing rest means there's no need for this solution, this whole thing is about being able to write [first, ..., last].

Let me see if I can be even more clear about what I mean. For example, your algorithm for [first, ...rest, last] is defined in terms of having two cursors, and the iterator is done when they meet in the middle. That's because rest would also include last unless the forward cursor stops when it encounters the backwards cursor. Implementing that behavior correctly will be the responsibility of everyone who implements a deiterator.

If we say that everyone who writes an iterator is also writing part of the internal implementation of destructuring, then we must also assume that some of those implementations will be incorrect. Sometimes an author will forget to stop the forward iterator when it hits the back cursor, and since this is JavaScript, someone will come to rely on the nonsense behavior and it will never be possible to remove - and this would become one of those language features that we just tell people not to use if they value their sanity...

conartist6 commented 2 years ago

I should note that I'm not unbiased (though I hope my arguments are logical). But yeah, I have my own version of this proposal: https://es.discourse.group/t/array-prototype-reversed-map-set/556

conartist6 commented 2 years ago

Sorry, to clarify I meant "you" as "the authors of this proposal" except where I tagged Jordan. I am aware that a buffering solution is possible, and in fact I've implemented that solution as part of iterTools.slice(start, -end).

hax commented 2 years ago

@conartist6

map should be roughly like:

function map(upstream, fn) {
  let iter = { __proto__: Iterator.prototype }
  if (upstream.next) iter.next = () => fn(upstream.next())
  if (upstream.nextLast) iter.nextLast = () => fn(upstream.nextLast())
  return iter
}

Of course this code is very naive (not passing the args of next, no throw/return, etc.), and it doesn't follow the built-in conventions (built-ins use unbound prototype methods and internal slots, not bound functions). Actually, iterator transformations are hard even with generators:

function *map(upstream, fn) {
  for (let v of upstream) yield fn(v)
}

Such code invokes @@iterator, so it's actually an iterable helper, not an iterator transformation. And it also doesn't pass through the protocol (though as an iterable helper, that may be OK).

This rules out using generators to implement iterator helpers, so for starters you've made pretty much all current iterator code unsafe.

No. All current iterators still work; they just don't support nextLast, so they don't support let [..., last] = currentIter.

About generators: I also hope we could allow people to write deiters via generator syntax. But even without generators, at least most built-in iterators and iterator helpers could support double-ended iteration, so there is still enough benefit.

Even in the current situation, you could still write a deiter via a generator:

class IntRange {
  constructor(start, end) {
    if (!Number.isSafeInteger(start)) throw new TypeError()
    if (!Number.isSafeInteger(end)) throw new TypeError()
    this.start = start
    this.end = end
  }
  @doubleEnded *[Symbol.iterator]() {
    let {start, end} = this
    // the first (priming) resume carries no method name; see the decorator below
    let method = yield
    while (start < end) {
      method = method == "nextLast"
        ? yield --end
        : yield start++
    }
  }
}

The doubleEnded decorator:

function doubleEnded(g) {
  const {next} = g.prototype
  // Prime the generator on first use: run to the first `yield`,
  // so later resumptions can deliver the requested method name.
  g.prototype.start = function () {
    if (this.started) return
    this.started = true
    Reflect.apply(next, this, [])
  }
  g.prototype.next = function () {
    this.start()
    return Reflect.apply(next, this, ['next'])
  }
  g.prototype.nextLast = function () {
    this.start()
    return Reflect.apply(next, this, ['nextLast'])
  }
}
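
Assuming the sketch above works as intended (e.g. with a legacy-style decorator that receives the generator method as a function), usage would look roughly like:

const it = new IntRange(0, 4)[Symbol.iterator]()
it.next()      // { value: 0, done: false }
it.nextLast()  // { value: 3, done: false }
it.next()      // { value: 1, done: false }
it.nextLast()  // { value: 2, done: false }
it.next()      // { value: undefined, done: true } — the cursors have met
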
hax commented 2 years ago

const [first, ...rest, last] = iter and your solution is for all iterator implementations to be rewritten to support it.

See https://github.com/tc39/proposal-deiter/blob/main/why-deiter.md . It's better to open a separate issue to discuss the design choice.

hax commented 2 years ago

Implementing that behavior correctly will be the responsibility of everyone who implements a deiterator.

Only if they want to support the double-ended semantics. If not (for example, streams are normally one-way), they don't need to care about deiter. And in such cases, let [..., last] = stream will throw - and should throw (this is why the buffered solution is bad).

If we say that everyone who writes an iterator is also writing part of the internal implementation of destructuring, then we must also assume that some of those implementations will be incorrect. Sometimes the author will forget to stop the forward iterator when it hits the back cursor, and since this is javascript someone will be relying on the nonsense behavior and it will never be possible to remove, and this would become one of those language features that we just tell people not to use if they value their sanity...

Double-ended is not easy, but it's not as hard as you described. next/nextLast are just shift/pop, so if somebody can write shift/pop correctly, they can write next/nextLast correctly.
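
For example, a minimal array-backed deiter in this spirit (a sketch, not proposal text) is little more than two index cursors, where the i < j check is exactly the "meet in the middle" condition:

function arrayDeiter(arr) {
  let i = 0, j = arr.length
  return {
    next: () => i < j
      ? { done: false, value: arr[i++] }   // like shift()
      : { done: true, value: undefined },
    nextLast: () => i < j
      ? { done: false, value: arr[--j] }   // like pop()
      : { done: true, value: undefined },
  }
}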

The really hard part is iterator transformation, and it's already hard even without double-ended semantics; different userland iterator-transformation libraries have many subtle semantic differences. This is why we should have standardized built-in iterator helpers. I don't think average JS programmers should do it themselves, so I would agree with telling people not to write "iterator transformations" themselves. For library authors, it's their duty to solve the relatively hard problems, or there would be no need to use a library. 😅

ljharb commented 2 years ago

@hax the "why-deiter" doesn't talk about a method C, which is "support all iterables, but cache yielded results until the iterator is exhausted".

conartist6 commented 2 years ago

@ljharb that would be sound to implement, but it doesn't offer any way to do things like get the entries of a Map in reverse insertion order.

ljharb commented 2 years ago

@conartist6 sure, but the solution to that is the same as "but the default iterator doesn't provide just keys, just values, or just entries" - make a new method that produces the iterator you want, like .reverseKeys() or something - or even .keys(true). There's no reason that allowing rest syntax in a non-final position requires an additional iterator protocol or changes to the iterator protocol.

conartist6 commented 2 years ago

I'd be strongly in favor of the combination of keysReverse and [first, ...rest, last] without constant-time support for [first, ..., last].

conartist6 commented 2 years ago

@hax I disagree with you that it would be reasonable for tools to define different methods on their outputs depending on their inputs. That is something I always explicitly try to avoid for these reasons:

It makes code harder to read
It complicates debugging
It makes types harder to write
It makes it harder for engines to execute code efficiently by introducing polymorphism

hax commented 2 years ago

a method C, which is "support all iterables, but cache yielded results until the iterator is exhausted".

@ljharb I don't see any difference between "Mechanism A: Based on Array" and "method C".

hax commented 2 years ago

I'd be strongly in favor of the combination of keysReverse and [first, ...rest, last] without constant-time support for [first, ..., last].

@conartist6 What's the precise semantics of "the combination of keysReverse and [first, ..., last]"? Is that constant-time? I suppose you mean it is O(1) in space but O(n) in time. If so, it sounds like a half-optimization and over-engineering. In my experience, people either don't care about perf (so they are OK with Mechanism A) or care about both space and time.

ljharb commented 2 years ago

@hax because it would never be based on arrays, because it's iterable destructuring - so it should work for all iterables, using Lists, and only reify to an array as the final step.

hax commented 2 years ago

That is something I always explicitly try to avoid for these reasons:

@conartist6 All these reasons apply to all similar things which have a subtype/supertype relationship, for example readonly collections. I agree with you that we need to care about all these aspects, but we need to balance the different use cases, and in the deiter and readonly-collections cases, I think they both have enough benefit to be added to the language.

It makes code harder to read
It complicates debugging

Actually, this is why type systems are useful.

It makes types harder to write

We write types not because they are easy to write (many JS idioms require very complex types that are very hard to write; compared to much "type gymnastics", typing deiter is "easy"), but because they help us maintain our code and help others use it.

It makes it harder for engines to execute code efficiently by introducing polymorphism.

We need engine people to estimate the cost, but as I understand it, introducing deiter won't harm engine efficiency. On the contrary, buffer-based mechanisms burden engines with optimizing the built-ins (if people use double-ended destructuring broadly, engines will eventually be pushed to optimize it), and eventually engines would have to implement deiter-like things anyway - but as builtins-only magic. And like much magic today, once userland code gets involved, you suddenly lose the magic and hit a perf cliff.

hax commented 2 years ago

@ljharb It works for all iterables. I'm sorry if "based on array" sounds different, but here "based on array" only means that the semantics are close to the traditional array-based operations (get the rest, then splice off the last items).

Of course the implementation could (and should) use a List or something else, and only reify to an array if (the rest is) needed.

conartist6 commented 2 years ago

You're saying that there are all these ways we can deal with the burden of supporting this pattern. I'm saying I haven't yet seen any evidence of a use case important enough to be worth the extra complexity added to the language and the ecosystem.

hax commented 2 years ago

@conartist6 In general, I think that if we want to support double-ended destructuring, the best solution is deiter; the buffered solution has the perf issue, and is likely to get builtins-only optimization, which causes the worse ecosystem issue. I've explained that in the why-deiter doc. And note that if we adopt the buffered solution, it makes deiter forever impossible in the language. Even a userland implementation could cause a similar ecosystem issue if widely used.

I hope we can discuss the consequences of the different possible mechanisms for double-ended destructuring further, if you like. But maybe that should be in a separate issue.

conartist6 commented 2 years ago

Even the example I've been using isn't really right. I keep writing out const [first, ...rest, last] = iterable, but those names aren't quite representative of what's going on. last would more aptly be named lastOrUndefined, because if you really want to know the last item in the collection, you need to consider the possibility that you should instead be looking at first.

Your "why deiter" document doesn't really explain what all this is useful for, it just assumes that you want to do the things it allows you to do!

hax commented 2 years ago

Not only last; first could also be undefined. The only important thing is that double-ended destructuring always fills first first.

Your "why deiter" document doesn't really explain what all this is useful for

I think I already said that: it explains why deiter is the best solution to support double-ended destructuring, why the buffered solution is not good, and that there is no other possible solution as far as I understand.

I'm OK if anybody thinks the doc is not correct, and we can discuss it (just like we do here), but saying "it just assumes that you want to do the things it allows you to do" is not very helpful.

hax commented 2 years ago

from https://github.com/tc39/proposal-iterator-helpers/issues/123#issuecomment-1175657079 by @michaelficarra

I could imagine wanting a "find last element which passes some predicate" method that operates on every element from the front, avoiding the need to store the iterated elements, but we couldn't possibly call that findLast since that has different behaviour (due to possible side effects in the passed callback).

I find this comment also needs consideration: findLast/takeLast/dropLast/reduceRight, etc. - all such helpers can only bear those names if they are based on nextLast() semantics; otherwise the names are theoretically "wrong", though in practice that may not be a big issue (I'm not sure).
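
A sketch of the front-driven variant michaelficarra describes (findLastFromFront is a hypothetical name): it scans forward and remembers the most recent match, so it avoids storing the iterated elements, but the predicate's side effects run front to back:

function findLastFromFront(iterable, predicate) {
  let found
  for (const v of iterable) {
    if (predicate(v)) found = v   // overwrite with each newer match
  }
  return found
}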

conartist6 commented 2 years ago

I think I already said that: it explains why deiter is the best solution to support double-ended destructuring, why the buffered solution is not good, and that there is no other possible solution as far as I understand.

You talk about solutions, but you do not explain what real problems they solve.

I understand wanting to be able to write reduceRight or findLast, but these are fundamentally the same as reduce(reverseIter) and find(reverseIter).

I also still don't see the purpose of meet-in-the-middle cursors. They make it possible to write const [first, ..., lastOfRest] = iterable, but I just don't see what real (and common) problem is solved by being able to write that!

hax commented 2 years ago

@conartist6 If you're asking why we should add const [first, ..., last] syntax, I think the readme already has a clear motivation, with further material in the Prior Art and Previous/Old Discussions sections.

To be clear, if the committee thinks such syntax is not useful, the proposal will not advance. "Why deiter" only discusses the possible solutions for supporting that syntax.


Deiter vs. reverse iterator

A reverse iterator could be used to support findLast and the like, but it also requires you to implement reverse iterators for all iterables you want to use findLast with. And map, filter, etc. would also need to deal with reverse iterators to stay chainable. So it doesn't differ much from deiter.

The real main difference between a reverse iterator and a deiter is this: a reverse iterator requires you to write two iterators and maintain the correctness of the iteration algorithm in two places; a deiter only requires you to write one iterator.


About "meet-in-the-middle" complexity

A deiter is not necessarily based on "meet-in-the-middle" cursors, though "meet-in-the-middle" is a common pattern for deiters/deques.

You seem to claim that it's too complex for most programmers. I already responded to that in a previous comment, and there is one more thing: if it's a common pattern, then it's easy to provide common helpers that deal with the complexity. For example, you only need to write the deiter for IntRange once; every index-based collection can then use it to implement a deiter very easily.
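
For instance, assuming the IntRange deiter sketched earlier in this thread, a hypothetical deiterOf helper could derive a deiter for any indexed collection by mapping index results to elements:

function deiterOf(indexed) {
  const range = new IntRange(0, indexed.length)[Symbol.iterator]()
  // map an IteratorResult over indices to one over elements
  const pick = (r) => r.done ? r : { done: false, value: indexed[r.value] }
  return {
    next: () => pick(range.next()),
    nextLast: () => pick(range.nextLast()),
  }
}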

hax commented 2 years ago

As the iterator helpers proposal decided not to pass through the protocol, we have to switch to nextLast().

conartist6 commented 2 years ago

A reverse iterator could be used to support findLast and the like, but it also requires you to implement reverse iterators for all iterables you want to use findLast with.

Yes. Your solution also requires iterables to create new reverse iteration logic in order for it to be possible to use findLast.

And map, filter, etc. would also need to deal with reverse iterators to stay chainable. So it doesn't differ much from deiter.

I disagree wholeheartedly. Only deiter requires a new definition of map that is specifically for reversed values. If "values in reverse order" is just a normal forwards iterator, then the normal map operation over forwards iterators already works great and is completely chainable.

The real main difference between a reverse iterator and a deiter is this: a reverse iterator requires you to write two iterators and maintain the correctness of the iteration algorithm in two places; a deiter only requires you to write one iterator.

It's funny - you and I see this exactly opposite. You say that technically a deiter as defined by this proposal is "one iterator", which must be simpler than two iterators. I disagree completely. You haven't eliminated the need to write the second iterator; you've just ensured that the logic for both iterators has to share state and be intertwined.

hax commented 2 years ago

@conartist6

Your solution also requires iterables to create new reverse iteration logic in order for it to be possible to use findLast.

Double-ended iterators do not need separate reverse-iteration logic (just call nextLast() until done). So an iterable can have a single double-ended iteration logic, and a double-ended iterable is always both iterable and reverse iterable.
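
A sketch of that point (reversed is a hypothetical helper): a reverse iterator falls out of any deiter by simply driving it from the back:

function reversed(deiter) {
  return {
    next: () => deiter.nextLast(),   // back of the deiter is the front here
    [Symbol.iterator]() { return this },
  }
}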

If "values in reverse order" is just a normal forwards iterator, then the normal map operation over forwards iterators already works great and is completely chainable.

Consider an x that has both @@iterator and @@reverseIterator: Iterator.from(x).map(f1).findLast(f2) still doesn't work, unless the Iterator.from and map helpers also provide @@reverseIterator (which invokes x[@@reverseIterator]).

Of course you could write x[@@reverseIterator]().map(f1).find(f2), but that forces the author to move the reverse step to the very start of the pipeline (and also to reverse the transformations), and in many cases that's not possible (you can't access the upstream) or at least very inconvenient.

a deiter as defined by this proposal is "one iterator", which must be simpler than two iterators.

I didn't say one deiter must be simpler than two iterators.

A deiter could be a little more complex, especially in the simplest cases, where "meet-in-the-middle" is a noticeable addition to the logic. But in a simple case it won't add too much complexity - you've already seen some example code in my previous comments, and I hope you can measure the complexity of such code objectively. In a complex case, for example writing a deiter for tree traversal, the complexity mainly comes from the data structure, so "meet-in-the-middle" won't be a big deal. The total complexity of a deiter and of two iterators might be similar, but with a deiter you get many other benefits, like maintaining the iteration logic in one place instead of two, and supporting let [a, ..., b] = iterable, which is the most important motivation of this proposal.

You haven't eliminated the need to write the second iterator; you've just ensured that the logic for both iterators has to share state and be intertwined.

I don't object to anyone thinking of a double-ended iterator as two intertwined iterators, if that helps them understand. Just as the double-headed eagle may have originated from two eagles:

The double-headed eagle originated in the ancient Near East. It was probably a Hittite symbol, to be more exact. The concept of dual eagles also existed in ancient Greece; Zeus (Jupiter) was said to let two eagles fly from the east and west ends of the world and they would eventually meet at Delphi, proving it as the center of the world. -- quoted from https://www.quora.com/What-is-the-original-meaning-of-the-symbol-double-headed-eagle-How-did-it-appear-as-a-symbol-in-the-Byzantine-Empire

[image: a double-headed eagle emblem]

conartist6 commented 2 years ago

We both want two types of iterator. You want single and double ended. I want forwards and backwards. Does that make me more Greek or Hittite?