tc39 / proposal-iterator-helpers

Methods for working with iterators in ECMAScript
https://tc39.es/proposal-iterator-helpers

Consider making flatMap throw if the mapper returns a non-iterable #55

Closed bakkot closed 4 years ago

bakkot commented 5 years ago

In Array.prototype.flatMap, if the mapping function returns a non-array, it is treated as if it returned an array of length 1. That was to preserve symmetry with .flat.
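For reference, the existing Array behavior described here can be seen with a few small calls. This is only an illustration of the built-in Array.prototype.flatMap; the variable names are made up for the example.

```javascript
// Array.prototype.flatMap flattens one level of returned arrays and
// passes any non-array return value through as a single element.
const flattened = [1, 2, 3].flatMap(x => [x, x * 10]);

// A non-array return value is kept as-is, as if it were wrapped in [ ].
const boxed = [1, 2, 3].flatMap(x => x * 10);

// Strings are iterable but not arrays, so Array's flatMap leaves them whole.
const strings = ["ab", "cd"].flatMap(s => s);

console.log(flattened); // [1, 10, 2, 20, 3, 30]
console.log(boxed);     // [10, 20, 30]
console.log(strings);   // ["ab", "cd"]
```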

Here, I think it makes sense to be more strict (that is, to throw), for three reasons:

1) There's no .flat to keep parity with.
2) This version of .flatMap flattens all iterables, not just arrays. In particular, it flattens strings. I think anyone relying on the auto-boxing behavior would find that surprising, so I think we should just not have the auto-boxing behavior.
3) Auto-boxing would mean that adding a Symbol.iterator method to any object which did not previously have it would be a breaking change.
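The two behaviors under discussion can be sketched with plain generator functions. The names flatMapBoxing, flatMapStrict and isIterable are hypothetical helpers written for this illustration, not the proposal's spec text, and they assume the "flatten every iterable, including strings" semantics described in point 2.

```javascript
// Hypothetical helper: true for anything with a Symbol.iterator method.
function isIterable(value) {
  return value != null && typeof value[Symbol.iterator] === "function";
}

// Auto-boxing variant: a non-iterable mapper result is yielded as one value.
function* flatMapBoxing(iterable, mapper) {
  for (const item of iterable) {
    const result = mapper(item);
    if (isIterable(result)) yield* result;
    else yield result;
  }
}

// Strict variant: a non-iterable mapper result is an error.
function* flatMapStrict(iterable, mapper) {
  for (const item of iterable) {
    const result = mapper(item);
    if (!isIterable(result)) {
      throw new TypeError("flatMap mapper must return an iterable");
    }
    yield* result;
  }
}

const boxedNumbers = [...flatMapBoxing([1, 2], x => x)];      // [1, 2]
const boxedStrings = [...flatMapBoxing(["ab", "c"], s => s)]; // ["a", "b", "c"]
```

Under the boxing variant, numbers pass through untouched while strings are split into characters, which is the surprise point 2 is about. The strict variant throws a TypeError for the numbers case instead.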

ljharb commented 5 years ago

Does this mean that a flatMap callback would be unable to return a non-iterable?

bakkot commented 5 years ago

Doing so would cause .flatMap to throw, yes.

ljharb commented 5 years ago

That seems highly unusable - it means i always have to return an iterable, which effectively means everyone will slap [ ] around their values.

bakkot commented 5 years ago

If you are using .flatMap, presumably it is because you want to return iterables.

ljharb commented 5 years ago

I use flatMap when I want to not care about whether I'm returning an iterable or not.

michaelficarra commented 5 years ago

@ljharb How do you want strings returned by your mapping function handled?

bathos commented 5 years ago

adding a Symbol.iterator method to any object which did not previously have it would be a breaking change

Would this not be true regardless with flatMap? (i.e. return [ objThatLaterGetsIterator ])

bakkot commented 5 years ago

Would this not be true regardless with flatMap?

Generally speaking we regard turning an error case into a non-error case as being not breaking, so I don't think so.

i.e. return [ objThatLaterGetsIterator ]

That would never attempt to treat objThatLaterGetsIterator as an iterator, so it wouldn't matter. flatMap doesn't recursively flatten; regardless of whether objThatLaterGetsIterator has Symbol.iterator or not, the object itself would be in the output of the resulting iterator, not its contents.
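The one-level flattening described here can be checked with a small sketch. flatMapOneLevel is a hypothetical generator standing in for the proposed method; it only assumes that the mapper's result is iterated exactly once, without recursion.

```javascript
// Hypothetical one-level flatMap: iterate the mapper's result, nothing deeper.
function* flatMapOneLevel(iterable, mapper) {
  for (const item of iterable) {
    yield* mapper(item);
  }
}

const plainObj = { name: "plain" };
const iterableObj = {
  name: "iterable",
  *[Symbol.iterator]() { yield 1; yield 2; },
};

// Both objects are wrapped in an array, so only the array is flattened.
const out = [...flatMapOneLevel([0], () => [plainObj, iterableObj])];

console.log(out.length);             // 2
console.log(out[1] === iterableObj); // true: the object itself, not 1 and 2
```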

ljharb commented 5 years ago

Doesn’t flatMap take a depth argument? I assume I’d use that to determine if i wanted my string iterator consumed or not (in which case yes, I’d wrap strings in an array, but forcing that for every other primitive seems harsh)

bakkot commented 5 years ago

Doesn’t flatMap take a depth argument?

No.

ljharb commented 5 years ago

ah right, that’s only .flat - forget about that part :-)

so then yes, it’s that for strings, because they’re iterable, I’d have to wrap them in another iterable - but forcing that on all the other primitives (and non iterable objects) seems like a high price to pay.

bakkot commented 5 years ago

I think the main use case for flatMap is when you want to return an iterable from the mapping function - otherwise you'd just use map. So this price really does not seem that high to me, compared to the benefits listed in the OP.

forcing that on all the other primitives (and non iterable objects) seems like a high price to pay

Given that relying on this fallback mechanism requires remembering which things it's going to iterate and which it is going to leave alone, I really don't think the benefit is there. Like it says in the OP, anyone who's gotten accustomed to relying on it for the other primitive types is going to be really confused by the behavior for strings. Better that they just use the consistent interface.

devsnek commented 5 years ago

in the interest of covering all the bases, we could also special case strings

bakkot commented 5 years ago

I would be very strongly opposed to special casing strings.

ExE-Boss commented 5 years ago

This is how this works in Java 8’s streams, where the flatMap()’s functional interface argument implementation has to return a Stream instance.

This is also why Java has the static Stream.of<T>(T... values) and Stream.ofNullable<T>(T t) methods.

zloirock commented 5 years ago

I'm not a fan of this. I'm for better consistency with Array.prototype.flatMap.

zloirock commented 5 years ago

In the future, .flat could be added to the iterator prototype, and then we would lose the symmetry.

Jamesernator commented 4 years ago

I'm not a fan of this. I'm for better consistency with Array.prototype.flatMap.

Note that it was explicitly decided against flattening iterables because they weren't Arrays. Similarly, iterables aren't iterators either, so in my opinion the array thing isn't a strong argument.

Instead it could be an option to just flatten other Iterators (not Iterables).

Iterator.from([1,2,3,4])
  .flatMap(i => i)
  .toArray(); // [1,2,3,4]

Iterator.from([1,2,3,4])
  .flatMap(i => [i, i])
  .toArray(); // [[1,1], [2,2], [3,3], [4,4]]

Iterator.from([1,2,3,4])
  .flatMap(i => Iterator.from([1,1])) // Iterator flattening
  .toArray(); // [1,1,2,2,3,3,4,4]

zloirock commented 4 years ago

@Jamesernator for better, but not for full -)

Maximum consistency between the methods of different classes is good practice for a language and minimizes learning problems.

Some differences between methods make sense, like accepting iterables in flatMap, or passing only one argument to the callbacks of methods that take a callback.

Others just add complexity and make the language harder to learn, like removing the optional thisArg of methods with callbacks (legacy, but it could not cause any problems), or this proposal.

bergus commented 4 years ago

@ljharb

I use flatMap when I want to not care about whether I'm returning an iterable or not.

You might come from a different mindset, but that's really not how programming should work. Relying on dynamic types makes code hard to maintain and implementations hard to optimise. We suffered from this problem with the "promise or thenable or maybe not and just a plain value instead" return value of the promise then method's callback - this is the kind of overloading that leads to unexpected errors (objects with then methods there, iterable strings here) and made promises harder to understand for many learners. Please don't make the same mistake again.

One should use map when one does not want the values to get unwrapped, signalling to the reader that the callback might return any value. One should use flatMap when one does want the values to get unwrapped, signalling to the reader that the callback will return an iterable.

If one does not care and wants the code to work with both, one should explicitly call a wrapIfNotIterable function. (And a wrapIfNotIterableOrString could deal with the string problem). This is not the job of flatMap - if it was, it would be called flatMapWherePossible.
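A possible shape for the two helpers named above is sketched below. Only the names come from the comment; the bodies are an assumption about what they would do, and Array.prototype.flatMap plus spread is used so the example runs without the proposal.

```javascript
// Assumed behavior: leave iterables alone, wrap everything else in an array.
function wrapIfNotIterable(value) {
  const iterable = value != null && typeof value[Symbol.iterator] === "function";
  return iterable ? value : [value];
}

// Assumed behavior: same as above, but strings are treated as plain values.
function wrapIfNotIterableOrString(value) {
  return typeof value === "string" ? [value] : wrapIfNotIterable(value);
}

const mixed = ["12", [3, "4"], 5];

// The spread turns whatever iterable the helper returns into an array.
const flatAll = mixed.flatMap(v => [...wrapIfNotIterable(v)]);
const keepStrings = mixed.flatMap(v => [...wrapIfNotIterableOrString(v)]);

console.log(flatAll);     // ["1", "2", 3, "4", 5]
console.log(keepStrings); // ["12", 3, "4", 5]
```

The choice between the two helpers is then visible at the call site rather than hidden inside flatMap.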

it means i always have to return an iterable, which effectively means everyone will slap [ ] around their values.

I'm not sure where you'd need that - most of those cases should probably be using map instead. If you are thinking of callbacks that might return either an iterable or just a plain value, like

function filterValuesByKey(iterator, predicate) {
     return iterator.flatMap(([key, value]) => predicate(key) ? value : [])
//                   ^^^^^^^ assuming the overloaded version
}
new Set(filterValuesByKey(map.entries(), …))

then I would argue that predicate(key) ? [value] : [] is actually the right way to do that, as the version without [ ] is broken: it flattens iterable values, which is not desirable.
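The difference between the two callbacks shows up as soon as a value is itself iterable. The sketch below assumes the overloaded (auto-boxing) behavior and models it with a hypothetical generator, flatMapBoxing; the sample map and the keep predicate are made up for the example.

```javascript
// Hypothetical auto-boxing flatMap: iterables are flattened, others yielded.
function* flatMapBoxing(iterable, mapper) {
  for (const item of iterable) {
    const result = mapper(item);
    if (result != null && typeof result[Symbol.iterator] === "function") {
      yield* result;
    } else {
      yield result;
    }
  }
}

const entries = new Map([["a", 1], ["b", [2, 3]], ["c", 4]]);
const keep = key => key !== "c";

// Without [ ]: the array value stored under "b" is flattened by accident.
const unwrapped = [...flatMapBoxing(entries, ([k, v]) => (keep(k) ? v : []))];

// With [ ]: every kept value comes out exactly as it was stored.
const wrapped = [...flatMapBoxing(entries, ([k, v]) => (keep(k) ? [v] : []))];

console.log(unwrapped); // [1, 2, 3]
console.log(wrapped);   // [1, [2, 3]]
```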

devsnek commented 4 years ago

One should use map when one does not want the values to get unwrapped, signalling to the reader that the callback might return any value. One should use flatMap when one does want the values to get unwrapped, signalling to the reader that the callback will return an iterable.

This is very motivating to me. However, we do still have Array#flatMap which behaves differently.

bergus commented 4 years ago

@devsnek Tbh I have exactly the same concerns about Array.prototype.flatMap, only there it is already too late to fix.

ExE-Boss commented 4 years ago

@bergus

function filterValuesByKey(iterator, predicate) {
    return iterator.flatMap(([key, value]) => predicate(key) ? value : [])
}

If you don’t want value to be iterated, then instead of:

function filterValuesByKey(iterator, predicate) {
    return iterator.flatMap(([key, value]) => predicate(key) ? [value] : [])
}

You’ll want the following, which is more performant, because it doesn’t allocate extra arrays, which is why this proposal is being made in the first place:

function filterValuesByKey(iterator, predicate) {
    return iterator.filter(([key /*, value*/]) => predicate(key))
}

zloirock commented 4 years ago

It's not just wrapping something in an array literal; in the async version it's also 2 useless microtasks for each element (count the number of created promises) and non-trivial internal unwrapping logic. For example, the flatten case: [screenshot not preserved]

bergus commented 4 years ago

@ExE-Boss Yes, one could write iterator.filter(kv => predicate(kv[0])).map(kv => kv[1]) as well and avoid flatMap altogether, however this use case is one of the more common ones for using flatMap with returning array literals. (Related are mapMaybe in Haskell, filterMap in Rust, Elm, Purescript, flatMap/etc itself in Scala and other languages that have iterable Option types). It's just the simplest example I could come up with - see this post I found on the web for a little variation. Maybe @ljharb was thinking of something different anyway.
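As a rough check that the two formulations agree, here they are side by side on arrays, so the snippet runs without the proposal. The sample map and predicate are assumptions made for the example.

```javascript
const map = new Map([["a", 1], ["b", 2], ["c", 3]]);
const predicate = key => key !== "b";

// filter + map: keep matching entries, then project out the value.
const viaFilterMap = [...map.entries()]
  .filter(kv => predicate(kv[0]))
  .map(kv => kv[1]);

// flatMap with array literals: return [value] to keep, [] to drop.
const viaFlatMap = [...map.entries()]
  .flatMap(([key, value]) => (predicate(key) ? [value] : []));

console.log(viaFilterMap); // [1, 3]
console.log(viaFlatMap);   // [1, 3]
```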

@zloirock As I said, it's more about correctness. AsyncIterator.from(["12", [3, "4"], 5, "67"]).flatMap(it => it) does not give the desired results when you only want to flatten arrays, especially arrays whose element type you do not know (in a library function). That matters much more than how many microtasks an async iterator needs or how easy array literals are to inline for an engine.

zloirock commented 4 years ago

@bergus I'll get [the result I want](http://es6.zloirock.ru/#AsyncIterator.from(%5B%2212%22%2C%20%5B3%2C%20%224%22%5D%2C%205%2C%20%2267%22%5D)%0A%20%20.flatMap(it%20%3D%3E%20it)%0A%20%20.toArray()%0A%20%20.then(log)%3B), since when adding flatMap was proposed, it was decided to flatten iterable objects, not primitives.

brainkim commented 4 years ago

@zloirock Hi off-topic sorry but what are you using to count used microtasks in your screenshot? https://github.com/tc39/proposal-iterator-helpers/issues/55#issuecomment-547149945

bakkot commented 4 years ago

I agree with @bergus above, but also I think the third point in my OP is sufficiently compelling on its own:

Auto-boxing would mean that adding a Symbol.iterator method to any object which did not previously have it would be a breaking change.

I don't think that would be acceptable. So I don't think auto-boxing is acceptable. We don't even need to argue about which behavior is more desirable (though I think the not-boxing behavior is more desirable for the reasons given above).

zloirock commented 4 years ago

@brainkim just patching of built-ins -)

devsnek commented 4 years ago

Right now I kind of feel like there is no choice that results in a flatMap I feel proud of releasing to the ecosystem. Maybe we should defer it?

bergus commented 4 years ago

@zloirock Yeah, using strings in the example was a bit shortsighted as you wanted to make an explicit exception for them; I meant to use arbitrary iterable or non-iterable non-array values (which might as well have been objects). @bakkot has put this better than I could, I won't repeat his argument but I do fully support it.

bakkot commented 4 years ago

@devsnek #59 is the correct thing. Just do that.

devsnek commented 4 years ago

I agree that we definitely shouldn't have the current version, and that the throwing version is more consistent and predictable. I just also dislike the throwing variant from a subjective design standpoint. I wouldn't really care if this proposal didn't have flatMap, so my question was more about whether other people would really strongly be bothered if it didn't.

zloirock commented 4 years ago

@bakkot correct? Inconsistency with Array method, verbosity and performance problems? Thanks, no.

bakkot commented 4 years ago

@devsnek Yes, people would be bothered. See other threads. Just ship it with the correct behavior.

zloirock commented 4 years ago

@bergus I don't see any bulletproof argument. Custom iterables? We should handle them in both cases. Or not, if return of non-iterables is allowed and we expect only primitives or non-iterable objects as the output of flatMap.

michaelficarra commented 4 years ago

@devsnek If the goal of this proposal is to add the most common and fundamental higher-order functions on iterators, flatMap is essential.

devsnek commented 4 years ago

@bakkot 59 has a review comment you should attend to. also, you should consider how you address others, "Just do it", "Just ship it", etc. isn't very polite.

zloirock commented 4 years ago

@bakkot if you don't wanna return something non-wrapped from .flatMap - just don't do it by yourself or configure your linter - it will work anyway. Please, do not force others to do so.

bakkot commented 4 years ago

@devsnek Sorry for brusqueness; I'm not at a keyboard.

bergus commented 4 years ago

@zloirock Yes, in the flattening-arbitrary-values use case one would need to decide how to handle custom iterables as well, just as one would need to decide how to handle strings. However, from all this discussion I gather there are many different approaches, and I would therefore prefer flatMap not to ship with some hard-to-remember special-case wrapping behavior but force users to be explicit about which values they want to preserve and which they want to unwrap.

But anyway, I consider flattening of arbitrary values to be an edge case that rarely happens in the real world (in good code?). Please focus more on my filter-map use case, or something more specific like

function duplicateEverySecondValue(iterator) {
    return iterator.flatMap((x, i) => i % 2 ? [x, x] : [x]);
}

Of course it's a toy example again, but these have the general structure (with possibly more cases and more elaborate conditions) of 90% of the use cases where a flatMap callback would sometimes return [an iterable of] a single value. And in all those cases, returning plain x instead would be a bug that you would only discover once x is unexpectedly iterable.

zloirock commented 4 years ago

@bergus if you are afraid of such an error - you can write a rule for a linter in a few minutes. I prefer brevity, performance, and better consistency with the Array method.

zloirock commented 4 years ago

About handling strings - it's not something special - we already have a precedent - typed array constructors.

bakkot commented 4 years ago

@zloirock You can't fix this with a linter. Suppose you ship a website which has let memories = iter.flatMap(x => x === 0 ? null : new WebAssembly.Memory({initial: x})) in it. Later, the web platform adds Symbol.iterator to WebAssembly.Memory.prototype. Now your website is broken.
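The scenario can be simulated without touching a real platform class. In this sketch FakeMemory is a hypothetical stand-in for WebAssembly.Memory, and flatMapBoxing is a hypothetical generator modelling the auto-boxing behavior; neither is part of the proposal.

```javascript
// Hypothetical auto-boxing flatMap: iterables are flattened, others yielded.
function* flatMapBoxing(iterable, mapper) {
  for (const item of iterable) {
    const result = mapper(item);
    if (result != null && typeof result[Symbol.iterator] === "function") {
      yield* result;
    } else {
      yield result;
    }
  }
}

// Stand-in for a platform class that is not iterable today.
class FakeMemory {
  constructor(initial) { this.initial = initial; }
}

// "Shipped" code that relies on non-iterables being passed through.
const run = () => [...flatMapBoxing([1, 2], x => new FakeMemory(x))];

const before = run(); // two FakeMemory instances

// Later, the "platform" makes the class iterable.
FakeMemory.prototype[Symbol.iterator] = function* () { yield "page"; };

const after = run(); // ["page", "page"]: same code, different output

console.log(before.every(m => m instanceof FakeMemory)); // true
console.log(after);                                      // ["page", "page"]
```

With the throwing behavior the shipped code would have had to return [new FakeMemory(x)] from the start, and the later addition of Symbol.iterator would not change its output.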

zloirock commented 4 years ago

@bakkot you can write a rule which enforces [] around the .flatMap callback result if you need it. Your example is strange - the web platform will not add Symbol.iterator to something if that can break the web. Moreover - why would it?

bergus commented 4 years ago

@zloirock It would need a typechecker, not a linter, to recognize the need. (Btw, typechecking is another argument in itself: an overloaded flatMap callback return type not only doesn't let a programmer infer types properly, but also any typechecking algorithm is hindered by this).

zloirock commented 4 years ago

@bergus see above.

bakkot commented 4 years ago

@zloirock The web platform will generally try to avoid shipping breaking changes, yes. Which means that if we add this feature, and people start using it, they will never be able to add Symbol.iterator to something which currently lacks it. Which is bad.

zloirock commented 4 years ago

@bakkot your example is too far-fetched, so don't worry about it -)

bakkot commented 4 years ago

@zloirock I don't think it's far-fetched. It is reasonable to make things iterable. For example, the Streams spec added Symbol.asyncIterator to ReadableStream just this year, and it hasn't yet shipped in browsers, despite ReadableStream shipping for years.