tc39 / proposal-joint-iteration

a TC39 proposal to synchronise the advancement of multiple iterators
https://tc39.es/proposal-joint-iteration
51 stars 2 forks source link

Positional and named variants should be separate methods #9

Closed bergus closed 2 months ago

bergus commented 7 months ago

I don't think it's a good idea to overload the zip method based on whether the first argument is iterable or not. This would make it a refactoring hazard to add a [Symbol.iterator] method to objects. The only precedence for such overloading is Array.from, but I think it is generally discouraged. I understand that most zip invocations would use an inline array literal or object literal where this would not be an issue, but it's still a concern for other cases, and the language should avoid establishing this as a pattern.

michaelficarra commented 7 months ago

refactoring hazard to add a [Symbol.dispose] method to objects

I'm going to assume you meant Symbol.iterator here, but please do correct me if I'm mis-understanding something.

Generally, I would agree with your point, but I think there's not actually a risk with this API. As you say, most usage will be with an array or object literal, so is moot, but also:

See similar discussion about a combined positional/named Promise.all API in https://github.com/tc39/proposal-await-dictionary/issues/13. It seems likely we will take that approach with that proposal, and I would prefer to match it.

I think the main difference between combined and separate APIs that we need to consider is instead a developer wanting to use the named variant with an object that is iterable. The developer will have no easy way to do that with a combined API, but could do so with separate ones. This, I think, is quite contrived. I don't know what kind of objects would be iterable and also have a bunch of properties that contain iterators. In these cases, I think it's okay to make the developer jump through hoops a bit.

brad4d commented 7 months ago

I and at least a couple of other people I work with ( @concavelenz, @shicks ), would prefer that the positional and named variants be separate not from concern for the suggested refactoring hazard, but for error checking and readability.

If you have separate Iterator.zipToArrays(iteratorOfIterators, options) and Iterator.zipToObjects(iteratorOfIterators, options) methods you get these benefits:

  1. The reader can always tell what shape the return value will have, even if the first argument is heavily obfuscated instead of being a simple array or object.
  2. Both methods can do additional error checking since they know for certain what the code author's intended input and output are.
towerofnix commented 7 months ago

In the handmade library for our project we have separately named functions stitchArrays and transposeArrays, which really perform the same work with different language. In practice, we use stitchArrays (zip to objects) almost everywhere and transposeArrays (zip to arrays) almost nowhere.

The obvious framing is that this is for consistency of code style: we prefer to work with named options rather than positional options because it makes it clear which value being zipped correpsonds to which value being accessed:

for (const {imagePath, thumbtacksNeeded, dimensions} of stitchArrays({
  imagePath: imagePaths,
  thumbtacksNeeded: imageThumbtacksNeeded,
  dimensions: imageDimensions,
})) { ... }

query.groupPreviousAlbum =
  stitchArrays({
    albums: groupAlbums,
    currentIndex: groupCurrentIndex,
  }).map(({albums, currentIndex}) =>
      atOffset(albums, currentIndex, +1));

The point of code style, in general, is to optimize legibility of processes. So the overwhelming usage of stitchArrays speaks to an argument for "absolute aesthetic consistency", which does have its advantages — mainly associating exactly one behavior with exactly one aesthetic, making it easier to recognize and reason about the behavior when you write or revisit it.

But it's got its limits, too. Like I mentioned, transposeArrays and stitchArrays really are performing the same behavior, they're just resulting in a shape that you process with different aesthetics. Consider those examples if we combine the two functions into just one, zip:

for (const {imagePath, thumbtacksNeeded, dimensions} of zip({
  imagePath: imagePaths,
  thumbtacksNeeded: imageThumbtacksNeeded,
  dimensions: imageDimensions,
})) { ... }

query.groupPreviousAlbum =
  zip([groupAlbums, groupCurrentIndex])
    .map(([albums, currentIndex]) =>
      atOffset(albums, currentIndex, +1));

The aesthetics are different - but the behavior isn't. In both cases, we're iterating over multiple arrays at once, and stitching the items into a value that is ready for destructuring. Differentiating stitchArrays and transposeArrays is superfluous, because the difference is only aesthetic. That's a good argument for keeping these as one function, IMO.

OTOH, I don't actually agree with the exact argument I'm making :)

I think the crucial part isn't combining them into a single function, but rather, ensuring there's clear and common terminology. I dislike how we use the terms "stitch" and "transpose" because those are different verbs and the only commonality is the (grammatical) object "arrays", which is the same object as in completely unrelated operations, e.g. "map arrays", "filter arrays", etc. Functions with an essential overlap in behavior should indicate as much through their names.

I'm in favor of zipToArrays and zipToObjects for @brad4d's reasons above. I'm not worried about objects with [Symbol.iterator] (and agree that treating it as any array in a combined zip would be fine), but the expressive power of zipToArrays and zipToObjects is crucial when we're explicitly designing functions for the sake of improving expression.

Although zipToArrays and zipToObjects have an overlap in "zipping" behavior, they really don't have an overlap in output behavior. That differentiation matters a lot when we're talking about low-level data operations! I want to recognize the particular phrasing in these names, also: "zip to arrays" and "zip to objects" means the difference is only what we're zipping to, not what we're zipping in the first place.

(In fact, you need to understand what the zip functions operate on in the first place, but once you do, you bring that knowledge to both functions, because they implicitly take the same inputs. Only the outputs are different in the names, so only the outputs are different in the behavior! Sorry if this seems obvious — it's just worth pointing out, because it's a sign of good naming convention, and that's clearly quite essential here.)

I feel that zip outputting both kinds, depending on the shape of its input, is a bit "magical." You might first learn what zip does when it's provided an array, then be baffled to see it receive an object and apparently provide objects now, too; while the difference isn't complicated (that's what makes any good primitive!), it's also not self-explanatory.

zipToArrays and zipToObjects are self-explanatory. And, like mentioned above, they're also reliable and more practically expressive when working with a non-trivial (or, heavens allow it, externally provided) input shape.

If a particular project benefits from the shorthand expression of a single zip function (it really is concise!) — it's fairly trivial to implement a helper function zip() {} that applies Iterator.zipToArrays or Iterator.zipToObjects depending on the presence of [Symbol.iterator] on its input.

towerofnix commented 7 months ago

Quick PS with another argument for separating zipToArrays and zipToObjects — although these don't tell you what the input needs to be, they do give you a cue as to what the output is going to be: an iterator of arrays, plural; an iterator of objects, plural. IMO that's a big boon for learning and memorizing the meaning of these functions, which might be used only sparingly in a given project (or part of a codebase) — linguistic cues are as good as any cue for recalling what any abstraction means!

michaelficarra commented 5 months ago

I've split the method into zipToArrays/zipToObjects as requested here. Any opinion on a zip/zipToObjects naming instead? The tupling behaviour is the "common" one in that it has precedent from other languages/libraries, but the object one should be recommended for any time 3 or more iterators/iterables are used, so it should be just as easy to reach for.

ljharb commented 5 months ago

Seems simpler to have the default assume arrays, ie, zip/zipToObjects

brad4d commented 5 months ago

I would personally probably choose the name that is more explicit about what the output shape is, zipToArrays, but I'm also OK with zip for that case.

towerofnix commented 5 months ago

I'm of the same opinion as @brad4d above - zipToArrays is my baseline preference, but I don't think we have to be strict about choosing the more explicit option, given two good choices.

I think there's also a case to be made for tupling being a fundamentally "more compact" operation, since you're imbuing positional information instead of name information, and destructuring is flexible while staying compact.

// concisely choosing whatever parameter name you want
Iterator.zip([clients, addresses])
  .map(([client, address]) => /* do something */)

// not so concisely choosing parameter names...
Iterator.zipToObject({clients, addresses})
  .map(({clients: client, addresses: address}) => /* do something */)

// or, not so concisely choosing names in the zip itself
Iterator.zipToObject({client: clients, address: addresses})
  .map(({client, address}) => /* do something */)

Of course, this is no argument against using zipToObject in general — only that it's inherently less concise to work with than tupling! So, I can see the reasoning for a more concise name for the generally more concise approach.

(Also sorry if I've misrepresented the actual syntax of these methods here at all, been a little while since we've checked them out!!)

bakkot commented 5 months ago

The thing about names of the fields of objects in zipToObject is a good point, and inclines me more towards having zip instead of zipToArray.

easrng commented 5 months ago

Using arrays rather than objects will also likely minify better.

michaelficarra commented 2 months ago

The methods are now separated. The committee chose the names zip and zipToObjects as a condition for advancement to Stage 2.7 at the June plenary.