tc39 / proposal-iterator.range

A proposal for ECMAScript to add a built-in Iterator.range()
https://tc39.es/proposal-iterator.range/
MIT License

Make Number.range() return re-usable value #17

Closed sffc closed 1 year ago

sffc commented 4 years ago

I would consider separating the iterator from the object returned by Number.range(), such that Number.range() returns an immutable object (with getter properties like .from, .to, etc.), and calling [Symbol.iterator]() on it returns a "fresh" iterator with a .next() method. That way you can re-use the ranges. In other words, we should consider making the following code work:

const range = Number.range(0, 5);
for (let i of range) {
  // 0, 1, 2, 3, 4
}
for (let i of range) {
  // 0, 1, 2, 3, 4
}

This is what we are doing in the Intl.Segmenter proposal.

https://github.com/tc39/proposal-intl-segmenter/issues

CC @gibson042
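
A minimal sketch of the reusable design described in this comment. The helper name makeRange is hypothetical and not part of the proposal, plain frozen data properties stand in for the suggested getters, and only positive steps are handled:

```javascript
// Hypothetical helper (an assumption for illustration): returns an immutable,
// reusable range object instead of a one-shot iterator.
function makeRange(from, to, step = 1) {
  return Object.freeze({
    from,
    to,
    step,
    // Every call creates a fresh iterator, so the range itself is never consumed.
    *[Symbol.iterator]() {
      for (let value = from; value < to; value += step) {
        yield value;
      }
    },
  });
}

const range = makeRange(0, 5);
const firstPass = [...range];  // [0, 1, 2, 3, 4]
const secondPass = [...range]; // [0, 1, 2, 3, 4] again
```

Both passes see the full sequence because each one asks the range for a new iterator.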

littledan commented 4 years ago

This is sort of what I assumed the interface would be as well: Number.range() would return an iterable, so for-of loops work over it just fine. However, for the example in the README using iterator helpers, it'd be a little more verbose: you'd need Iterator.from(Number.range(...)).take(...). I could see the tradeoff either way, but mentally, for me, the "default" design would be an iterable.

Jack-Works commented 4 years ago

So reusability and easy use of iterator helpers are in conflict... I prefer the latter, but reusability is also important. Maybe add a .clone() that returns a new, unconsumed RangeIterator with the same from, to and step?

sffc commented 4 years ago

We decided against returning an iterator from Intl.Segmenter.prototype.segment() after long discussions about ergonomics (see, e.g., https://github.com/tc39/proposal-intl-segmenter/issues/93, and other issues in that proposal). Basically, if the return value of Number.range(...) should be considered a "thing", then you should make it have its own prototype that's an iterable but not an iterator.

I think we should explore other options here. For example, if you write Number.range(...)[Symbol.iterator]().take(...), does that feel more ergonomic than Iterator.from(Number.range(...)).take(...)?

Jack-Works commented 4 years ago

Iterator.from(...) and [Symbol.iterator]() are basically the same in this case; neither is ergonomic for chaining iterator helpers.

littledan commented 4 years ago

So reusability and easy use of iterator helpers are in conflict...

Yes, this is a thing that's been under discussion for iterator helpers in general. It's also important to be able to use them for Arrays, which are iterable but not iterators.

I liked this story: You can use Iterator.from for these situations, and we expect this to be ergonomic enough. (If we don't, I wonder if we could make some change to the iterator helpers proposal for a more ergonomic path...)

devsnek commented 4 years ago

Iterators are already iterable, so the only discussion point seems to be re-usability. I'd argue that if you want to reuse it you should call it again. That's more explicit, and it's weird to me that you'd need to make two calls (range()[Symbol.iterator]()) to begin consuming an iterator you directly asked for.
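
To illustrate the point that iterators are already iterable but single-use, here is a small sketch. The generator function range is a hypothetical stand-in for Number.range(), which is not assumed to exist in any runtime:

```javascript
// Hypothetical stand-in for Number.range(), written as a generator function.
function* range(from, to) {
  for (let value = from; value < to; value++) {
    yield value;
  }
}

const iterator = range(0, 3);

// A generator object's [Symbol.iterator]() returns the object itself.
const returnsItself = iterator[Symbol.iterator]() === iterator;

const firstUse = [...iterator];  // [0, 1, 2]
const secondUse = [...iterator]; // [] because the iterator is already exhausted
const calledAgain = [...range(0, 3)]; // [0, 1, 2]: calling range() again starts over
```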

littledan commented 4 years ago

My intuition around the iterator protocol was that we're sort of generalizing part of the functionality of Arrays. Arrays support [Symbol.iterator] to iterate through them, and other objects do too. I was picturing that Number.range would produce something which was also this same kind of "generalization of an Array", and that's why my intuition was that it wouldn't be an iterator, but rather a reusable iterable.

@Jack-Works Patching it up with .clone() would restore the functionality kinda, but it wouldn't meet the goal of making Number.range follow a common protocol to other iterable objects. Maybe we could make .clone() a more general thing, but I'd like to see if we could continue following the split of the initial iterable/iterator protocol, which already handles these two things.

@devsnek What in particular is weird about this? If you have a range function that returns an Array, you'll also need to call the Symbol.iterator method to iterate over it. Accordingly, since they should also work for Arrays, lots of usage sites will be calling Symbol.iterator anyway.

ljharb commented 4 years ago

It's the way generators already work - the generator is called (perhaps with arguments) to produce the (iterable) iterator. The generator itself is not iterable, you have to call it every time you want to iterate.

littledan commented 4 years ago

@ljharb I agree with this description of what generators are, but I don't see how that applies here and why Number.range shouldn't return something analogous to Arrays.

devsnek commented 4 years ago

The correct pattern here is definitely to return an iterator. If we want iterables for some reason the pattern you're looking for is a Number.Range class.

If it helps:

Class -> Instance (Iterable or callable) -> Iterator
Array -> array -> ArrayIterator
Map -> map -> MapIterator
Set -> set -> SetIterator
String -> string -> StringIterator
Number.Range -> numberRange -> NumberRangeIterator
Generator -> generator -> GeneratorIterator (Number.range ~= generator)
Array -> array.entries -> ArrayIterator
Array -> array.values -> ArrayIterator
Map -> map.entries -> MapIterator
Map -> map.values -> MapIterator
Map -> map.keys -> MapIterator
...

littledan commented 4 years ago

@devsnek OK, interesting analysis. I'll have to think about this some more. I'm wondering how/whether it'd apply to @sffc 's example of Intl.Segmenter (which is definitely an instance method, but also a slightly awkward case due to the random access methods).

devsnek commented 4 years ago

Another way to think of it: JS moved reusability out of the "Iterator" protocol itself (f() -> iterator is very convenient). Iterable is just the reification of "object with Symbol.iterator".

@littledan I'm admittedly not very familiar with the Intl.Segmenter proposal but a cursory glance seems to be this, which holds to the pattern but is kind of weird: Intl.Segmenter -> segmenter -> segmenter.SegmentsForString (via segmenter.segment() helper) -> segmentsForString -> SegmentsForStringIterator

It's worth noting that Intl has always had different patterns compared to the rest of the stdlib, most notably in their constructor patterns (Intl.NumberFormat(x).format(n) vs hypothetical Intl.formatNumber(x, n))

littledan commented 4 years ago

Well, I think Intl constructors are a good design that could make sense among anything TC39 creates, but we're getting a bit off-topic. For new APIs, I'd like to see if we can use common design patterns that make sense in general, and having Intl.Segmenter follow the iteration protocol (unlike Intl.v8BreakIterator) was intended to be part of that.

devsnek commented 4 years ago

my point was basically that both Number.range() and Intl.Segmenter().segment follow the pattern as far as i can tell, but i wouldn't really equate one directly to the other, because they have to (and do) represent different things.

On topic again, Number.range is itself the reusable bit (that is the generator pattern). You can just do const whateverRange = () => Number.range(from, to); and then you get the explicit bonus of having to do for (const item of whateverRange()) instead of for (const item of whateverRange), where it's not immediately apparent whether or not you're doing something reusable (because, as i said above, Iterable in js only means there is a Symbol.iterator method, not that it will create fresh state)
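
A short sketch of the wrapper pattern from this comment. The generator function range is a hypothetical stand-in for Number.range(), and the bounds are arbitrary:

```javascript
// Hypothetical stand-in for Number.range().
function* range(from, to) {
  for (let value = from; value < to; value++) {
    yield value;
  }
}

// The arrow function is the reusable part; each call yields a new iterator.
const whateverRange = () => range(0, 3);

const passes = [];
for (let pass = 0; pass < 2; pass++) {
  const seen = [];
  for (const item of whateverRange()) {
    seen.push(item);
  }
  passes.push(seen);
}
// passes is [[0, 1, 2], [0, 1, 2]]

// The function itself is not iterable, so forgetting the call fails loudly:
// `for (const item of whateverRange)` would throw a TypeError.
const functionIsIterable = typeof whateverRange[Symbol.iterator] === "function"; // false
```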

Jack-Works commented 4 years ago

Another thought: (working but it's a crazy design)

Add a @@iterator on the iterator itself to make it auto-cloneable.

const range = Number.range(0, 5);
for (let i of range) { } // uses range[@@iterator], which returns a cloned range
for (let i of range) { } // uses range[@@iterator], which returns a cloned range
// range is not consumed yet.
range.take(...).toArray() // now it's consumed

devsnek commented 4 years ago

iterators already have @@iterator on them. But aside from that, doesn't

for (let i of range()) { }
for (let i of range()) { }

seem like a much more obvious way of saying you're using a separate range for each one? I don't understand why you want to make that implicit.

gibson042 commented 4 years ago

An Intl.Segmenter Segments instance, like an array/set/map/string/etc., is Iterable but not itself an Iterator—it is primarily a factory for constructing iterators, and also supports a containing method of its own for returning a single random-access iteration result without actually constructing an iterator. In our opinion, it would be negatively surprising for the same object to support both next() and containing(index), especially if index can refer to a position that has already been described by a previous iteration result, so we instead separated the two roles by defining the Symbol.iterator method on the iterable to always return a new iterator (again, just like array/set/map/string/etc.).

So the question here is whether Number.range is an iterator factory or returns an iterator factory. I personally can't imagine much use for the latter, and to me Number.range(0, 10) is much more like Array(10).keys() (a trivial iterator that basically supports only next()) than it is like Array.from(Array(10), (_, i) => i) (an object with its own state and methods that are independent of any constructed iterator(s)).
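
The contrast drawn here between the two expressions can be checked directly with existing Array APIs. This is only an illustration of the single-use versus reusable distinction:

```javascript
// A trivial iterator: once drained, it stays drained.
const keys = Array(10).keys();

// An array: an object with its own state, independent of any iterator made from it.
const array = Array.from(Array(10), (_, i) => i);

const keysFirst = [...keys];   // [0, 1, ..., 9]
const keysSecond = [...keys];  // []

const arrayFirst = [...array];  // [0, 1, ..., 9]
const arraySecond = [...array]; // [0, 1, ..., 9] again
```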

Jack-Works commented 4 years ago

If there are no objections, I'll change it to return a Range object with [Symbol.iterator] on it later. Waiting for more discussion.

Return an iterator

Current design. Problem:

Number.range(...): RangeIterator<number>

Behave like Symbol constructor

Problem:

Number.range(...): Range
class Range {
    [Symbol.iterator]()
    get from()
    get to()
    get step()
}

A normal class

Problem:

class Range {
    constructor(from, to, step)
    [Symbol.iterator]()
    get from()
    get to()
    get step()
}
Number.Range = Range
BigInt.Range = Range

devsnek commented 4 years ago

I think I've objected to that idea a few times over at this point...

sffc commented 4 years ago

I think it's a question we should discuss at plenary, since not everyone is in agreement on this thread.

littledan commented 4 years ago

We discussed the issue in plenary, but I don't think we reached a particular conclusion. What are the next steps?

Jack-Works commented 4 years ago

I'm going to study the ranges in other languages to make a comparison then decide which pattern is better 👀👀

Jack-Works commented 4 years ago

I'm going to study the ranges in other languages to make a comparison then decide which pattern is better 👀👀

Interestingly, I found this re-usability problem in Rust, which uses Iterator semantics for ranges. (Two for-in loops over the same range don't pass the borrow checker, so I manually call next() to consume it.)

fn main() {
    let mut a = std::ops::Range { start: 3, end: 5 };
    print!("Used: {:?}\t", a.next());
    for i in a {
        print!("In loop: {:?}\t", i)
    };
}

Used: Some(3) In loop: 4

But I think manually calling the next method is too explicit, and I think other attempts to consume the iterator twice will not pass the borrow checker. So I have no idea whether this is really a problem in Rust.

Jack-Works commented 4 years ago

During my recent research into ranges in other languages (not complete; you can see it at https://github.com/tc39/proposal-Number.range/blob/master/compare.md), I changed my mind and now prefer the Iterable semantics (or some other way to keep the iterator semantics while allowing safe re-use).

But before switching to the Iterable semantics, there are a few problems that we need to resolve.

Use with Iterator Helper

// Now
Number.range(0, 10).take(5).toArray()
// After
Iterator.from(Number.range(0, 10)).take(5).toArray()

I have an idea at https://github.com/tc39/proposal-iterator-helpers/issues/78#issuecomment-642426096 but I'm not sure about it.

The naming problem

range for Iterator and Range for iterable (the idea of @devsnek)

Let it be a class

// before
Number.range(0, 1)
// after
new Number.Range(0, 1)
// or Callable class
Number.Range(0, 1)

I have no idea if adding a new callable class like Array is acceptable today.

Let it be a helper to create the Range class

Number.range(...) // implicitly calls new Range(...)

Another possible route: merge into Iterator namespace, becomes Iterator.range

Developers would then clearly know that the return value is an iterator, and that iterators are not re-usable.

ljharb commented 4 years ago

Can you elaborate on why you think it would be better to add an entire class for something that seems only likely to be used to generate a single iterator?

Jack-Works commented 4 years ago

Can you elaborate on why you think it would be better to add an entire class for something that seems only likely to be used to generate a single iterator?

I don't like adding a class for it (the uppercase R, the new required to construct it), but it seems like @devsnek requires that an iterable be tied to a class.

If we want iterables for some reason the pattern you're looking for is a Number.Range class.

It is also possible not to add a class, and instead return a normal object with its own internal slot and prototype (for helper methods like includes). But I think if we do so, this kind of object is just a class instance without a constructor.

ljharb commented 4 years ago

The iterator Number.range returns would need to be an instance of Iterator, sure, but that doesn't constrain anything about Number.range itself.

To me all that's needed is a factory function for an iterator. "iterable" is just a protocol, like "thenable" or "toStringable" or "valueOfable" or "toJSONable" - there's no "iterable" class just like there's no "jsonable" class. I continue to be confused by this direction.

Re the OP's desire for a reusable value: if we decide that's important, I'd expect Number.range(x, y) to return a thunk for an iterator with an own Symbol.iterator method on it that's return this() - ie, a function that takes no arguments and returns a new iterator each time - so that the immediate usage was Number.range(x, y)().
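
A rough sketch of the thunk idea described in this comment. The names rangeIterator and rangeThunk are hypothetical, and nothing here is an API from the proposal:

```javascript
// Hypothetical stand-in for the iterator that Number.range() would produce.
function* rangeIterator(from, to) {
  for (let value = from; value < to; value++) {
    yield value;
  }
}

// Hypothetical: returns a zero-argument function (a thunk) that is also iterable.
function rangeThunk(from, to) {
  const thunk = () => rangeIterator(from, to);
  // Own Symbol.iterator method that is just "return this()".
  thunk[Symbol.iterator] = function () {
    return this();
  };
  return thunk;
}

const r = rangeThunk(0, 3);
const direct = [...r()];     // immediate usage, like Number.range(x, y)()
const firstSpread = [...r];  // spread calls r[Symbol.iterator]() with this = r
const secondSpread = [...r]; // a fresh iterator again
```

All three results are [0, 1, 2], since every call of the thunk creates a new iterator.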

Jack-Works commented 4 years ago

Number.range(x, y) to return a thunk for an iterator with an own Symbol.iterator method on it that's return this() - ie, a function that takes no arguments and returns a new iterator each time

I like this design; it seems to resolve almost all of the problems. The only question is whether this API style is consistent with other JS APIs. It seems like no prior API is designed like this.

If it is OK for others in this thread, I'd like to move to this API later.

ljharb commented 4 years ago

No, i don't think there's any precedent for expecting iterators to be reusable in the first place :-)

Jack-Works commented 4 years ago

No, i don't think there's any precedent for expecting iterators to be reusable in the first place :-)

Agreed. I think we need to choose between a normal generator (current design) and a "generator factory" ((...opts) => () => Iterator<number>). After all, an iterator is not re-usable. This problem exists for any generator function. And I prefer generators to classes.

tabatkins commented 4 years ago

After reading over this thread again, I agree with Jordan. This should just return an iterator. That's favored by ergonomics (can use immediately, no need to convert to an iterator), and matches how it would be implemented by an author (as a generator function).

The Intl.Segmenter counter-example actually has use-cases for dealing with the segments as a reified object, of which iteration is just one of the operations you can perform over it. I don't think number ranges rise to that level, so it's not worth putting together an actual object representing the range. (We don't even plan to let you measure the length of the range; you just splat it into an Array and measure the length of that.)

Getting multiple instances of the same generator is trivial: just call Number.range() again with the same args, or write a one-liner function that does it for you. Or if you wanna get fancy, you can make a Range class that lets you configure the options as properties, and calls Number.range() for its [Symbol.iterator] method.
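
The "fancy" option mentioned here could look roughly like the following user-land sketch. The generator function range is a hypothetical stand-in for Number.range(), the class is not part of the proposal, and only positive steps are handled:

```javascript
// Hypothetical stand-in for Number.range().
function* range(from, to, step = 1) {
  for (let value = from; value < to; value += step) {
    yield value;
  }
}

// User-land class: options are plain, configurable properties.
class Range {
  constructor(from, to, step = 1) {
    this.from = from;
    this.to = to;
    this.step = step;
  }

  // Each iteration calls range() with the current property values.
  [Symbol.iterator]() {
    return range(this.from, this.to, this.step);
  }
}

const r = new Range(0, 10, 5);
const coarse = [...r]; // [0, 5]
r.step = 2;
const fine = [...r];   // [0, 2, 4, 6, 8]
```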

But the core usage is trivial and small, and favors a plain ol' generator-like implementation, returning an iterator.

Jack-Works commented 4 years ago

@tabatkins what do you think about the latter idea Jordan mentioned? range() would return an iterable generator to resolve the reusability problem

ljharb commented 4 years ago

to be clear; i don't personally consider it a problem, and imo the best solution is to make it return an iterator, and for those who want reusability, they can stick () => in front of their Number.range() call.

tabatkins commented 4 years ago

I think Number.range(10)() is a weird and unprecedented pattern, and we shouldn't introduce it here. As Jordan says immediately above, people can just make their own arrow functions if they want an easy way to create the same range multiple times.

Again, if you wrote this yourself, the obvious way to do so is with a generator function, which would just return an iterator. We shouldn't be innovating in surprising ways here without a pretty compelling use-case, which so far hasn't been presented.

Jack-Works commented 4 years ago

Can I take it that we have consensus to use iterator semantics?

CC those who seem to support the iterable semantics: @sffc @littledan

And it's still possible to maintain reusability with a @@iterator method on the iterator itself, which would implicitly construct the range again with the same arguments (and therefore return a fresh iterator for for...of, Set, and Array.from usage).

Andrew-Cottrell commented 4 years ago

And it's still possible to maintain reusability with a @@iterator method on the iterator itself, which would implicitly construct the range again with the same arguments (and therefore return a fresh iterator for for...of, Set, and Array.from usage).

I can imagine a possibility where I might pass the result of a range call to a function that accepts any Iterator instance. Such a function might first get some values and then consume any remaining values with either a for-of loop or Array.from. I think, for this to work as expected, the iterator's @@iterator method would need to return the this value. However, I cannot think of a plausible example at the moment, so perhaps this use case does not exist or is very rare.

Is there a precedent for a built-in iterator's @@iterator method returning anything other than the this value ⁠— I did not find one ⁠— and would it be a surprising behavior?
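
The concern can be made concrete with a small sketch. Both range and cloningRange are hypothetical: the first behaves like a standard generator, the second models the "auto-cloning" @@iterator idea from earlier in the thread:

```javascript
// Standard behaviour: a generator object's [Symbol.iterator]() returns `this`.
function* range(from, to) {
  for (let value = from; value < to; value++) {
    yield value;
  }
}

// Hypothetical auto-cloning iterator: [Symbol.iterator]() rebuilds the range.
function cloningRange(from, to) {
  let current = from;
  return {
    next() {
      return current < to
        ? { value: current++, done: false }
        : { value: undefined, done: true };
    },
    [Symbol.iterator]() {
      // Fresh iterator from the original arguments; progress so far is ignored.
      return cloningRange(from, to);
    },
  };
}

// A consumer that takes the first value, then drains the rest with for-of.
function headAndRest(iterator) {
  const head = iterator.next().value;
  const rest = [];
  for (const value of iterator) {
    rest.push(value);
  }
  return { head, rest };
}

const standard = headAndRest(range(0, 4));       // { head: 0, rest: [1, 2, 3] }
const cloning = headAndRest(cloningRange(0, 4)); // { head: 0, rest: [0, 1, 2, 3] }
```

With the cloning variant, the for-of loop restarts from the beginning, so the first value is seen twice.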

Jack-Works commented 4 years ago

I'll close this and keep the non-reusable iterator semantics. I think the current semantics are okay; developers will learn that an iterator is not reusable, because every iterator behaves like this, just as they have to know that Array.prototype.sort mutates the original array. Happy to hear any further suggestions.

hax commented 4 years ago

I still hope this can be reconsidered. There are many arguments about the reusability issue in iterator helpers, and there seems to be no consensus. My suggestion is that we'd better not spread non-reusable semantics in the language before we have real, solid consensus.

Note that currently only the values/keys/entries methods return iterators directly in the language and web APIs. But they are methods with no params, so they do not have the reuse issue, because u can just reuse the objects.

The only exception I know of is matchAll, which we recently added; I'm not sure whether returning an iterable or an iterator was discussed. But at least there seem to be few reuse use cases for string match results, and even if there were one, it only has one param (a regexp, mostly a literal or const) and is not too hard to reuse. On the other hand, I feel range has many reuse use cases, and range has 2 params plus an optional options object, which makes reuse much harder than with matchAll.

ljharb commented 4 years ago

It’s already reusable as a function; matchAll is also reusable with a string and a regex inside a function.

We don’t need to produce nouns (objects) when we have verbs (functions) available.

hax commented 4 years ago

It’s already reusable as a function

Not sure what u mean by "reusable as a function". Do u mean creating a reusable closure? Yes, we could, but it adds extra cost; it's actually just like creating a reusable iterable manually. If it's the common case, why don't we provide an API that returns an iterable instead of an iterator?

ljharb commented 4 years ago

Yes, i mean stick () => in front of it - I’m not sure what cost that adds.

I don’t actually think it’s a common case - personally i think the common case will be to make a range and use it once. But, just like every other one-use operation in the language, it has the easy composability of “put it in a function” to make it reusable. Addition isn’t reusable either, and that hasn’t stopped anyone from making reusable addition functions :-)

Jack-Works commented 4 years ago

I'll leave this decision to the July meeting, hope we can find the correct route

hax commented 4 years ago

I really feel there are many costs if range returns an iterator.

Learning/understanding/remembering cost:

Programmers need to remember that range() returns an iterator and can't be reused. This is very different from range in python, _.range in lodash, range in ix (I mention the ix package because iterator-helpers lists it as prior art)..., and I will argue that the intuition behind the name "range" is that it should be immutable and reusable.

Error-prone:

Even if programmers know that range() returns iterators, there are many cases where we start from a simple one-time usage of range like:

...
const numbers = Number.range(start, end, {step})
consume(numbers)

Nothing is wrong with such usage, but as time passes, it's possible the code becomes:

import {numbers} from './common'
consume(numbers)

Now, when we reuse numbers for any reason one day, we introduce a bug. Note that it's hard to recognize that numbers is an iterator.

Such accidents could occur in various forms.

// a very correct usage!
export function f() {
  const numbers = Number.range(start, end, {step})
  return consume(numbers)
}

After careless refactoring

import {numbers} from './common'
export function f() {
  return consume(numbers)
}

Note that u could likely still pass all unit tests of f, unless u have a test that calls f() twice. And fortunately (or unfortunately), no production code calls f() twice, so depending on what "bug" means to u, it might not even be seen as a bug at all.

Refactoring cost:

We finally find the potential issue with f(). How do we fix it?

We could rewrite numbers as a closure like @ljharb suggested, but u also need to change consume(numbers) to consume(numbers()) which means u need to change all existing client code that uses numbers.

So a better solution is to just make numbers an iterable object:

export const numbers = {
  [Symbol.iterator]() { return Number.range(start, end, {step}) }
}

Another small problem is that the parameters may be expressions. So simply adding () => or wrapping it in an iterable may be inefficient or even wrong, and the final code would look like:

const start = exp1, end = exp2, step = exp3
export const numbers = {
  [Symbol.iterator]() { return Number.range(start, end, {step}) }
}

If programmers end up writing such code, they will ask why range is not iterable in the first place, so that they could just keep a simple export const numbers = Number.range(exp1, exp2, {step: exp3}) without any trouble.
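
The hazard and the iterable-wrapper fix from this comment can be reproduced in a few lines. The generator function range is a hypothetical stand-in for Number.range(), and consume is an arbitrary example consumer that sums its input:

```javascript
// Hypothetical stand-in for Number.range().
function* range(start, end) {
  for (let value = start; value < end; value++) {
    yield value;
  }
}

// Example consumer (an assumption for illustration): sums everything it is given.
function consume(numbers) {
  let total = 0;
  for (const n of numbers) {
    total += n;
  }
  return total;
}

// After the "careless refactoring": one shared iterator for every call.
const sharedIterator = range(0, 4);
const f = () => consume(sharedIterator);
const firstCall = f();  // 6
const secondCall = f(); // 0, because the shared iterator is already exhausted

// The iterable wrapper: callers still pass `numbers`, but each use is fresh.
const numbers = {
  [Symbol.iterator]() {
    return range(0, 4);
  },
};
const g = () => consume(numbers);
const fixedCalls = [g(), g()]; // [6, 6]
```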

hax commented 4 years ago

I'll leave this decision to the July meeting

@Jack-Works In that case, we'd better reopen this issue until we have a final decision.

devsnek commented 4 years ago

After careless refactoring

This refactoring hazard doesn't seem realistic to me. You can make that mistake with literally any value specific to a given function invocation, being an iterator doesn't exacerbate that problem.

function x() {
  const time = Date.now();
  consume(time);
}
function add(a, b) {
  const result = a + b;
  consume(result);
}

You are continuously describing the need for reusable logic, not reusable data. Luckily, it is extremely easy to create a function in JS, so I don't think we need to worry about it.

u also need to change consume(numbers) to consume(numbers()) which means u need to change all existing client code that uses numbers.

That's literally a single location. If it is multiple locations that means the function code was already broken because it was trying to consume the same iterator multiple times.

tabatkins commented 4 years ago

@hax: Nothing in your argument is specific to Number.range(); it's a generic argument against the concept of generators in general (or perhaps against ever exposing a generator from spec-defined functions).

If you want to argue against the core language ever using generators, that's fine (but I doubt you'll succeed). If you want to argue that generators in general are fine, but there are specific reasons that Number.range() is bad as a generator, that's fine too (but I don't see any). But you can't make a generic argument against generators and then only apply it to this specific case.

hax commented 4 years ago

@devsnek I don't get your example; time and result in your examples seem to be immutable values and can't be compared to iterators.

it is extremely easy to create a function in JS

Yes, it's extremely easy, but it only makes sense if programmers understand that they need to create a function.

@tabatkins I'm arguing:

  1. Many people aren't aware of the difference between iterator and iterable.
  2. Even if they know, most will not expect that range is not iterable, especially since most other programming languages use an iterable for range.
  3. Before this proposal, we don't have built-in or web APIs that return iterators directly, except values/keys/entries and matchAll.

Point 2 is specific to Number.range and point 3 is more general.

About generators: I am not arguing against using generators, but rather suggesting a best practice for using them. As in my previous comments about iterable vs iterator, generators should be used for building components or for ad-hoc usage, not directly as public API. If they are used for public API, we'd better only use them for methods with no arguments, so there will be no reuse issue.

devsnek commented 4 years ago

@hax it's not about mutability, it's about how you can't factor logic out without wrapping it in a function. if you factor Date.now() out without wrapping it in a function you'll get the wrong time. if you factor a+b out without wrapping it in a function that also handles the values you'll get a reference error. Because this is already how js works, I think programmers have a good understanding that reusing something often requires wrapping it in a function.

hax commented 4 years ago

@devsnek I still don't agree that they are comparable. Programmers expect Date.now() to return a different value on every single call. This is not true for most other things, including range. And in the a + b example, a and b are local variables; in cases like my example (where a and b are consts outside the function), u could of course factor a+b out, and u should, because computing it every time is wasteful.

Jack-Works commented 4 years ago

@hax suggests that we can add a .values() on the %RangeIteratorPrototype%. Then it can be used like

range(a, b).values().map(...).take(...).toArray()

I think this is the perfect way to resolve the ergonomics with Iterator Helpers. Therefore I prefer Iterable now.
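
A rough sketch of how a reusable range with a .values() method might behave. The names makeRange and take are hypothetical; take is a hand-written stand-in for the iterator helper of the same name, so the sketch does not depend on iterator helpers being available:

```javascript
// Hypothetical reusable range: iterable, with values() returning a fresh iterator.
function makeRange(from, to) {
  return {
    *values() {
      for (let value = from; value < to; value++) {
        yield value;
      }
    },
    [Symbol.iterator]() {
      return this.values();
    },
  };
}

// Hand-written stand-in for the iterator helper .take(limit).
function* take(iterable, limit) {
  if (limit <= 0) return;
  let count = 0;
  for (const value of iterable) {
    yield value;
    if (++count >= limit) return;
  }
}

const r = makeRange(0, 10);
const firstFive = [...take(r.values(), 5)]; // [0, 1, 2, 3, 4]
const everything = [...r];                  // 0 through 9; r is still reusable
```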