tc39 / proposal-iterator-helpers

Methods for working with iterators in ECMAScript
https://tc39.es/proposal-iterator-helpers
1.33k stars 35 forks source link

Consider making flatMap throw if the mapper returns a non-iterable #55

Closed bakkot closed 4 years ago

bakkot commented 5 years ago

In Array.prototype.flatMap, if the mapping function returns a non-array, it is treated as if it returned an array of length 1. That was to preserve symmetry with .flat.

Here, I think it makes sense to be more strict (that is, to throw), for three reasons:

1) There's no .flat to keep parity with. 2) This version of .flatMap flattens all iterables, not just arrays. In particular, it flattens strings. I think anyone relying on the auto-boxing behavior would find that surprising, so I think we should just not have the auto-boxing behavior. 3) Auto-boxing would mean that adding a Symbol.iterator method to any object which did not previous have it would be a breaking change.

bergus commented 4 years ago

@zloirock As another example, consider recursively flatMapping a DOM tree to an iterator of node lists. Remember when NodeList became iterable and got its forEach method?

zloirock commented 4 years ago

@bergus ...and? In cases like that, on objects, sure, better wrap the result to something iterable, like an array, but why should I wrap to array primitives? Why should I write a callback several times longer and sacrifice performance? Why should I remember one more difference between this and Array method?

zloirock commented 4 years ago

On Array method we have one semantic. On this method, you propose another. On collections method we will have third. On another abstraction - fourth... Hail JS!

Jamesernator commented 4 years ago

I'd still like to point out the option to flatten other Iterators only:

e.g.

function* flatMap(mapperFn) {
  for (const value of this) {
    const mapped = mapperFn(value);
    if (hasSlot(mapped, [[Iterated]])) {
      // flatten mapped
    } else {
      yield mapped;
    }
  }
}
instead of ```js function* flatMap(mapperFn) { for (const value of this) { const mapped = mapperFn(value); if (mapped[Symbol.iterator]) { // flatten mapped } else { yield mapped; } } } ```

This gives fallback behavior for non-iterators, avoids the no-upgrade to Iterable hazard as we don't flatten them, and is similarly symmetric with Array (.flatMap only flattens the same type, if you want to flatten an iterable you can use Iterator.from/.values()/[Symbol.iterator]()/etc).

devsnek commented 4 years ago

Assuming you mean iterator records ({ [[Iterator]], [[NextMethod]], [[Done]] }), those are never exposed to js code.

Jamesernator commented 4 years ago

Assuming you mean iterator records ({ [[Iterator]], [[NextMethod]], [[Done]] }), those are never exposed to js code.

Looks like I was assuming [[Iterated]] was something on all Iterator instances.

Edited pseudocode:

function* flatMap(mapperFn) {
  for (const value of this) {
    const mapped = mapperFn(value);
    if (mappped instanceof Iterator) { // NOT checking for Iterable
      // flatten mapped
    } else {
      yield mapped;
    }
  }
}
devsnek commented 4 years ago

Being an instance of Iterator is not a requirement of the iterator protocol, so we shouldn't use that as a check.

bergus commented 4 years ago

@Jamesernator I think it has been mostly agreed about in the discussion https://github.com/tc39/proposal-iterator-helpers/issues/47#issuecomment-535230523 that flattening iterables is much more useful than flattening iterators. But regardless which protocol we check for (by testing the existence of a next method for iterators, or a Symbol.iterator method for iterables), this issue is discussing what should happen when the check fails: should it throw an exception or should it yield the wrong value from the resulting iterator?

ExE-Boss commented 4 years ago

I think it should throw an exception, line any sane programming language would.

ljharb commented 4 years ago

Let’s avoid terms like “sane”, please.

Jamesernator commented 4 years ago

I think it has been mostly agreed about in the discussion #47 (comment) that flattening iterables is much more useful than flattening iterators.

The same was argued for Array.prototype.flatMap.

And if we agree operating on iterables is strictly more useful I'm still not sure why this proposal isn't for operating on iterables instead.

this issue is discussing what should happen when the check fails: should it throw an exception or should it yield the wrong value from the resulting iterator?

I think we should avoid discussing specific features and instead focus on the problems. The reason I came up with Iterator-only flattening was so that there'd be a middle ground between not having the upgrade-to-iterable hazard while also providing a solution for people who want to mix.

bergus commented 4 years ago

@Jamesernator If we did iterator-flattening instead of iterable-flattening, the issue would be an upgrade-to-iterator hazard instead. (Which probably might happen even more accidentally, as people don't realise a next method implements the Iterator interface). What we really want here is to prevent people from mixing non-iterable/non-iterator values in the flatting. It's just unsafe, and unnecessary for most use case anyway.

conartist6 commented 4 years ago

My background: I maintain iter-tools. My thoughts:

I definitely think flatMap should throw on non-iterables, for all the reasons described here. I don't think that as @zloirock suggests this will mean that people wrap [...iterable] around their flatMap callback returns. It just means that plain iterators which are not also flagged as iterables using [Symbol.iterator]() { return this } are nearly useless because lacking a (safe) means to detect them reflectively the programmer must just know what they have. I'm presuming that the iterators returned by methods like flatMap will have [Symbol.iterator]() { return this }, as all generators do. If all generators, iterators transformed from generators, native data types, and iterators returned from libraries like iter-tools, itertools, ix, Immutable.Seq, etc, are actually iterables, I don't see where these naked iterators that people are going to wrap with Arrays are really going to be originating. They shouldn't exist.

If I'm being really honest, I don't even think iterator should be a separate protocol. It's just needlessly confusing. It is in essence a subpart of the iterable protocol, which interestingly does not specify whether separate iterators returned from Symbol.iterator() do or do not share state, meaning that you must always assume that they do (that all iterables are merely iterators) unless you as the programmer surely know otherwise.

I also would strongly support special casing string. Nothing is gained by turning a string (an iterable of length one strings) into an iterable of length one strings, save throwing away irrevocably the fact that the iterable contains only characters and is safe to use with string APIs. In fact it isn't, and even if you did know you had an iterable of characters, there is no API for turning an iterable of characters back into a string. Finally any kind of optimization that the javascript engine had for the string is lost. Scratch that, because this is flatMap not flat you just shouldn't return a string from your callback and you're fine.

Jamesernator commented 4 years ago

If we did iterator-flattening instead of iterable-flattening, the issue would be an upgrade-to-iterator hazard instead.

Note I am specifically suggesting only flattening those that inherit Iterator. This seems fine given you need to wrap with Iterator.from for iterators that don't inherit Iterator anyway to even get the methods.