tc39 / proposal-flatMap

proposal for flatten and flatMap on arrays
https://tc39.github.io/proposal-flatMap
214 stars 19 forks source link

Should we support flattening iterables instead of concatSpreadables? #8

Closed zenparsing closed 7 years ago

zenparsing commented 8 years ago

Should the following work?

function* g() { 
    yield 1; yield 2; yield 3;
}

[0, 0, 0].flatMap(_=> g());
// [1, 2, 3, 1, 2, 3, 1, 2, 3]

Currently, it doesn't work because flatMap and flatten basically use concat, which only works for Arrays and @@isConcatSpreadables.

ljharb commented 8 years ago

To confirm: as written, you'd need [0, 0, 0].flatMap(_=> [...g()]); to get the output you want?

zenparsing commented 8 years ago

@ljharb That's right.

ljharb commented 8 years ago

If it flattened iterables, then wouldn't returning a string never work as intended, because it'd spread the string out into code points?

zenparsing commented 8 years ago

Good question. Let's say that we continue using @@isConcatSpreadable. Should it be an error if you return something from the mapper that isn't spreadable?

Or, if we used iterables, should it be an error if you return a non-iterable?

It seems like that would be a programmer error, but is there a use case?

ljharb commented 8 years ago

Should it be an error if you return something from the mapper that isn't spreadable?

I should be able to use flatMap to return either an array, or a non-array, and end up with only non-arrays down to the specified depth. If I understand your question correctly, I think it's entirely expected to return non-spreadable things - just like [1, [2]].flatten() would return [1, 2], i'd expect var v = [1, [2]].flatMap(x => x); to return the same.

zenparsing commented 8 years ago

I'm actually wary of explaining flatMap and flatten in the same way, for this reason.

What does flatMap do in other languages or libraries? I know that in RxJS, if you return something that isn't "observable" from a Rx.Observable.prototype.flatMap mapper, you'll get an error. I'm pretty sure that if we ever implemented Iterator.prototype.flatMap it would also error if the mapper returned a non-iterable.

bterlson commented 8 years ago

@zenparsing Ruby will allow flatMap to return anything. I also think it aligns with concat to allow it. Seems potentially handy. Worth looking around. Also @mattpodwysocki might have thoughts.

Re: array-like vs iterable protocol, both are probably useful depending on what you want. If it's not important that flatMap and concat have similar semantics I'd be fine flattening iterables. We could also say that A.p.flatMap looks for array-likes and Iterator.p.flatMap works on iterables? The downside there is Iterable.prototype.flatMap is less easy to use on arrays which is the primary use case.

Jamesernator commented 7 years ago

I'd rather see semantics based on whatever algorithm is used in Array.from, I would definitely find being able to flatMap using generator functions very useful for a few more complicated mappings, and it gives the benefit of still allowing ArrayLikes.

I'm not sure what would be done with other objects that Array.from just gives an empty array on, but I know that personally I've never used or had a case where flat-mapping a mix of iterables and non-iterables would have actually been an intuitive solution. So obviously solutions for dealing with flattening those objects should probably be suggested by people who have actual examples of where they've flatMap-ed mixed arrays.

bterlson commented 7 years ago

FWIW, a year later I'm convinced iterables are the right approach :) Alignment is not necessary (especially since from already uses iterables).

ljharb commented 7 years ago

@bterlson what about https://github.com/bterlson/proposal-flatMap/issues/8#issuecomment-191853768 ?

bterlson commented 7 years ago

@ljharb I think you get individual characters, yes. I'm not convinced this is "unexpected". Put another way, I think you have to pick which is more unexpected - getting individual characters, or not allowing iterables...

ljharb commented 7 years ago

Then how would I flat map an array of strings? Every use case with strings isn't "give me code points".

What about something simple like files.map(path => readFileSync(path).split('\n')).flatMap(processLine) - where I want an array of lines back, not an array of every code point in all the lines in all the files?

I think not allowing iterables is very expected in any array API, since the vast majority of them do not.

bterlson commented 7 years ago

@ljharb: new ones do seem to allow iterables? (Array.from at least)

If you want an array of strings then you return an array of strings rather than just a string.

ljharb commented 7 years ago

Indeed, Array.from is the sole (i think) new Array method that takes an iterable; I'd argue that's a special case.

I suppose if the depth defaulted to 1 in flatMap, i'd just be sure that the depth wasn't too deep to grab the strings, so the use cases I'm concerned about aren't impossible, nor particularly difficult, to achieve (with that default).

Without explicitly specifying the depth (and thus knowing exactly what items I was flatmapping over), would there be any way to specify that I wanted flatMap to function like concat (with, say, an infinite depth)?

zenparsing commented 7 years ago

@ljharb Are strings as base values the only reason that one would prefer the concat behavior?

Any sane iterator-based flatMap would flatten iterables, and you would have the same string-spreading problem. That leads me to believe that adding Symbol.iterator to String.prototype was an unfortunate mistake. We should have added a method to get an iterator.

ljharb commented 7 years ago

@zenparsing I agree that an iterator-based flatMap would flatten iterables, naturally :-)

I also agree that if String.prototype.codePoints returned an iterator, rather than String.prototype[Symbol.iterator], I would have no concerns about using iterators here - that pattern would have also allowed us to have a codeUnits iterator, a codePoints iterator, and later, a graphemes iterator. Not impossible, but perhaps a missed opportunity.

However, I think this hazard also exists for every iterable type - I might want a flatMap of Maps, or Sets, and I'd need to do the same "tight coupling of the depth argument to knowledge about the contents" there to avoid losing my O(1) lookup.

Perhaps flatMap should be usable in two modes: concatSpreadable, and iterable? (even if the default is "iterable")

Jamesernator commented 7 years ago

@ljharb I feel like it'd be more appropriate to have another method like .concatMap, passing in a mode to .flatMap would feel a bit weird e.g.:

[1,2,3,4,5].flatMap(
    x => someTimesStringSomeTimesArray(x),
    undefined, /* thisArg */
    'concat'
)
// The thisArg is a bit of a pain, obviously they could be swapped or would
// that be too confusing with .map(func, thisArg), also does anyone actually
// use thisArg, I've certainly never used it on .map, but breaking parity
// with .map probably isn't a good idea 

// vs
[1,2,3,4,5].concatMap(x => someTimesStringSomeTimesArray(x))
michaelficarra commented 7 years ago

I prefer isConcatSpreadable here, for two reasons:

  1. This is exactly equivalent to [].concat(...xs), the most common implementation of [].flatten today.
  2. For those who work with heterogeneous arrays (shudder), the iterable behaviour may be surprising when they did not expect that thing to be iterable.
Jamesernator commented 7 years ago

I don't really know what should be done with .flatten given that I don't see myself ever needing it. But I've definitely never had a case where I've had arrays like this: ['weirdly', ['mixed', 'data']].

That alone makes it extremely un-compelling to me that people would rather have the ability to do this:

// Really what's a real use case of this type of flattening?
// How does this sort've data arise?
const dat = [
    'who', 
    ['even', ['has']], 
    'data', 'like', 
    ['this'],
    'in the first place'
    [['?']],
].flatten()

Over this:

function search(files, string) {
    return files.flatMap(function*(file) {
        for (let i = 0 ; i < fileLength(file) ; i++) {
            if (readChunk(file, i, string.length) === string) {
                yield { file, index: i }
            }
        }
    })
}

search(['foo.txt', 'bar.text'], ['fizzbuzz'])
bakkot commented 7 years ago

@Jamesernator, it is perfectly reasonable to represent a tree whose leaves are not necessarily all at the same depth as a heterogenous array, and to expect .flatten to work on a such a data structure.

In general, "I would never use this" isn't all that compelling of an argument.

ljharb commented 7 years ago

Considering array.concat(notAnArray, maybeAnArray, alsoNotAnArray) is a pretty common use of concat, I think that flattening mixed arrays and non-arrays is going to be pretty common as well. Not every array holds a single consistent type, and that's totally fine in JavaScript.

Jamesernator commented 7 years ago

@bakkot I understand what you mean by "I would never use this" isn't that compelling of an argument, but I showed instead an example of what I would find useful which is mutually exclusive with IsConcatSpreadable.

My problem with using IsConcatSpreadable is it'd hurt my (and I'm sure other people's too) use cases for .flatMap. I already use the iterable version of flatMap as a function but I find it a bit of a pain breaking up the method chains with arbitrary intermediate variable names which is why I'd like to see it be an official method of arrays.

It's a bit of a pain for everyone that the different use cases for .flatMap/.flatten mean one set of use cases is necessarily going to be more difficult (and probably will lead to bugs), obviously developers are going to be biased towards their use cases so they'll tend to prefer the option that helps their use cases.

All this does make me wonder is what use cases this proposal should be aiming to tackle and which ones are less important and whether or not .flatten/.flatMap even need to use the same flattening rule.

ljharb commented 7 years ago

Between "concat spreadable" and "iterable", what do the majority of Array.prototype methods do? Whatever that is, that's what these Array.prototype methods should probably do (spoiler: it's concat spreadable).

However, Array.from takes an arraylike or an iterable; what if we had these two concat spreadable methods, but also Array.flatten/Array.flatMap that similarly takes an arraylike or an iterable?

bakkot commented 7 years ago

@Jamesernator, I guess I don't really understand your use case. Why is what you've written any better than

function search(files, string) {
  return files.flatMap(file => {
    let indices = [];
    for (let i = 0 ; i < fileLength(file) ; i++) {
      if (readChunk(file, i, string.length) === string) {
        indices.push( { file, index: i } );
      }
    }
    return indices;
  });
}

search(['foo.txt', 'bar.text'], ['fizzbuzz'])

?

This array-returning version seems strictly easier to reason about, given that flatMap unconditionally evaluates the entire iterator and builds an array from it. And, of course, if you already have an generator g, you can files.flatMap(file => [...g(file)]).

But if flatMap takes an iterable, there's no simple to use that to solve the concatSpreadable use case, that I can see. It will end up spreading your strings - which I think would be surprising to the vast majority of users.


I agree that a flatMap for iterators would be a useful thing, but I don't think it should be Array.prototype.flatMap.

michaelficarra commented 7 years ago

I agree that a flatMap for iterators would be a useful thing, but I don't think it should be Array.prototype.flatMap.

This is the killer. @Jamesernator Remember that we're specifying Array.prototype.flatMap and not %IteratorPrototype%.flatMap. The only reason we aren't restricting this to just arrays is that we have precedent of some (isConcatSpreadable) values being "close enough" for the other Array prototype methods.

bterlson commented 7 years ago

Let's go with isConcatSpreadable for now.