jashkenas / coffeescript

Unfancy JavaScript
https://coffeescript.org/
MIT License

Proposal: Lazy comprehensions #5379

Open FelipeSharkao opened 2 years ago

FelipeSharkao commented 2 years ago

Create a syntax (I'm thinking of for*) that lets comprehensions produce an Iterator instead of a list, similar to what Python does. This would help when the goal is not to map a list, but to use the result in another iteration.

Proposed Syntax

somethingExpensive item for* item in list

Normal Behavior

(function() {
  var i, len, results;
  results = [];
  for (i = 0, len = list.length; i < len; i++) {
    item = list[i];
    results.push(somethingExpensive(item));
  }
  return results;
})();

Proposed Behavior

(function*() {
  var i, len;
  for (i = 0, len = list.length; i < len; i++) {
    item = list[i];
    yield somethingExpensive(item);
  }
})();

I'm planning on implementing this (it does not seem hard, and it'd be a nice introduction to the CoffeeScript architecture), so I wanted to know if it's a welcome feature.
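To make the motivation concrete, here is a minimal JavaScript sketch of the "use the result in another iteration" case. It is hand-written, not compiler output; the body of somethingExpensive, the call counter, and the firstAbove consumer are assumptions added for illustration.

```javascript
// Hypothetical stand-in for the issue's `somethingExpensive`; it counts its calls.
let calls = 0;
function somethingExpensive(item) {
  calls += 1;
  return item * 2;
}

const list = [1, 2, 3, 4, 5];

// Hypothetical consumer: stops at the first value greater than `limit`.
function firstAbove(values, limit) {
  for (const value of values) {
    if (value > limit) return value;
  }
  return undefined;
}

// Eager: like today's comprehension, every item is processed up front.
const eagerResults = [];
for (let i = 0; i < list.length; i++) {
  eagerResults.push(somethingExpensive(list[i]));
}
const eagerFound = firstAbove(eagerResults, 4);
const eagerCalls = calls; // all 5 items were processed

// Lazy: like the proposed output, items are processed only as they are requested.
calls = 0;
const lazyResults = (function* () {
  for (let i = 0; i < list.length; i++) {
    yield somethingExpensive(list[i]);
  }
})();
const lazyFound = firstAbove(lazyResults, 4);
const lazyCalls = calls; // only the first 3 items were processed
```

Both versions find the same value, but the lazy one stops calling somethingExpensive as soon as the consumer stops asking for items.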

edemaine commented 2 years ago

This seems like a cool idea to me. I'm not totally sure for* is particularly clear — it matches function* but that's not CoffeeScript notation...

While thinking about alternatives, I wondered about yield something for item in list, but that has a meaning now; in fact, you can build generator comprehensions in CS now like so:

do -> yield somethingExpensive item for item in list; null

The final ; null is optional, but it avoids building up a useless array of undefineds. The do -> is the interesting bit, because it isolates the yield to a created IIFE instead of yielding in the parent scope.
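In JavaScript terms, the isolation provided by do -> might look roughly like the sketch below. This is a hand-written approximation of the intent, not actual compiler output; the parent function and the body of somethingExpensive are assumptions.

```javascript
const list = [1, 2, 3];
const somethingExpensive = (item) => item * 10; // hypothetical placeholder

// A generator standing in for whatever scope contains the comprehension.
function* parent() {
  // Immediately invoked generator function: the yields inside belong to
  // this inner generator, not to parent().
  const inner = (function* () {
    for (const item of list) {
      yield somethingExpensive(item);
    }
    return null; // mirrors the trailing `; null`: no array of results is built
  })();

  // parent() itself yields exactly once, handing out the inner generator.
  yield inner;
}

const parentValues = [...parent()];
const innerValues = [...parentValues[0]];
```

Without the wrapping function, each yield would suspend parent() itself, so the enclosing scope would become the generator.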

I still think shorthand for this could be nice.

FelipeSharkao commented 2 years ago

I can't think of a good syntax that builds upon previous iterator syntax (yield and from), while being clear and readable. If we make an exception and drop one of the two, I can think of using yield after for (not very readable), or using each after for (readable, but not very clear what it does).

somethingExpensive item for yield item in list
somethingExpensive item for each item in list

We can't use yield for (that would be more readable) since it has meaning when the body of the for is after it.

yield for item in list
  somethingExpensive item

# Same as
arr = for item in list
  somethingExpensive item

yield arr
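The difference between the existing yield for meaning and a lazy comprehension can be sketched in JavaScript as follows. The function names and the body of somethingExpensive are assumptions for illustration only.

```javascript
const list = [1, 2, 3];
const somethingExpensive = (item) => item + 1; // hypothetical placeholder

// Existing meaning of `yield for ...` with an indented body:
// build the whole array first, then yield it as a single value.
function* yieldWholeArray() {
  const arr = [];
  for (const item of list) {
    arr.push(somethingExpensive(item));
  }
  yield arr;
}

// What a lazy comprehension would do instead: yield one value per item.
function* yieldEachItem() {
  for (const item of list) {
    yield somethingExpensive(item);
  }
}

const whole = [...yieldWholeArray()]; // one element, which is itself an array
const each = [...yieldEachItem()];    // one element per item in list
```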

each is readable, but fails to communicate its use easily (I don't see that as a problem, since a simple explanation would clear that up, and an explanation is already required so that people know this syntax exists in the first place).

GeoffreyBooth commented 2 years ago

So is this basically the inverse of for…from? https://coffeescript.org/#generator-iteration. That iterates over generators, whereas this creates a generator?

In other words, this package but built into the language? So this is achievable today via that package wrapping the comprehension, e.g. arraygen(somethingExpensive item for item in list)? Or is there a meaningful difference?

I have to say, it feels a little weird to me. Do any of the Array prototype functions like map, forEach etc. return a generator? Since arrays themselves are already iterable, how is returning a generator function different?

FelipeSharkao commented 2 years ago

So this is achievable today via that package wrapping the comprehension, e.g. arraygen(somethingExpensive item for item in list)

No, it isn't. Wrapping the comprehension that way, you'll iterate over the list once, and then iterate again over the arraygen. Generators are useful when you want to spread out the computational load of the operation (it is not executed when it's declared, hence 'lazy').
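A rough JavaScript sketch of that difference, assuming a hypothetical arrayGen helper that simply yields the elements of an array (the real package's API is not shown in this thread and may differ):

```javascript
let calls = 0;
const somethingExpensive = (item) => {
  calls += 1; // count how much work has actually been done
  return item * item;
};
const list = [1, 2, 3, 4, 5];

// Hypothetical stand-in for the package: turns an existing array into a generator.
function* arrayGen(array) {
  yield* array;
}

// Wrapping an eager comprehension: the array is fully built before arrayGen sees it.
const wrapped = arrayGen(list.map(somethingExpensive));
const wrappedFirst = wrapped.next().value;
const callsWrapped = calls; // every item was already processed

// Lazy comprehension: work happens only when a value is requested.
calls = 0;
function* lazyComprehension(items) {
  for (const item of items) {
    yield somethingExpensive(item);
  }
}
const lazyFirst = lazyComprehension(list).next().value;
const callsLazy = calls; // only the first item was processed
```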

Since arrays themselves are already iterable, how is returning a generator function different?

Yeah, creating an iterator that returns the list (item for each item in list or arraygen(list)) is kinda useless, but while Array.prototype[@@iterator] returns only an iteration over the items, a lazy comprehension allows you to "pipe" operations to it before doing any work, and then do the work on demand.

It is also useful for functional programming, since it is more declarative than imperative; that is why it is commonplace in functional languages like Haskell.

rdeforest commented 2 years ago

I'd like to build on what @edemaine said.

Where one might currently write (using an example from your PR):

lazyMap    = (fn  ) -> (generator) -> yield fn item for item from generator               ; null
lazyFilter = (pred) -> (generator) -> yield    item for item from generator when pred item; null

squareItems = lazyMap    (a) -> a * a
oddItems    = lazyFilter (a) -> a & 1

results     = squareItems oddItems [1 .. 3]

Your patch allows this to be written as

results = a * a for a from [1 .. 3] when a & 1

Yes?

edemaine commented 2 years ago

I think you mean the following (need to wrap in parens, like all comprehensions in CoffeeScript, or else it's a one-line for expression; and need to use new notation):

results = (a * a for* a from [1 .. 3] when a & 1)
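For comparison, a hand-written JavaScript sketch of both forms: the composed lazyMap/lazyFilter helpers and a single generator that behaves the way the for* comprehension is meant to. This is an approximation for illustration, not output from the patch, and it uses the array [1, 2, 3] in place of the range [1 .. 3].

```javascript
// Composable lazy helpers, mirroring lazyMap and lazyFilter from the thread.
const lazyMap = (fn) =>
  function* (iterable) {
    for (const item of iterable) yield fn(item);
  };
const lazyFilter = (pred) =>
  function* (iterable) {
    for (const item of iterable) {
      if (pred(item)) yield item;
    }
  };

const squareItems = lazyMap((a) => a * a);
const oddItems = lazyFilter((a) => a & 1);

// Composed pipeline: nothing runs until the result is consumed.
const composed = squareItems(oddItems([1, 2, 3]));

// Single generator with the filter and the mapping fused into one loop,
// which is roughly what the lazy comprehension would express.
const fused = (function* () {
  for (const a of [1, 2, 3]) {
    if (a & 1) yield a * a;
  }
})();

const composedValues = [...composed]; // [1, 9]
const fusedValues = [...fused];       // [1, 9]
```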

Idly, I was wondering whether another type of brackets might make sense (similar to how Python uses () for generator comprehensions and [] for list comprehensions). For example, we might write the generator as follows:

results = <a * a for a from [1 .. 3] when a & 1>

Hmm, angle brackets conflict with JSX. Never mind.

zsakowitz commented 2 years ago

What about brackets or curly braces? Currently, this is invalid code (I hope)

results = {a * a for a from [1..3]}

but this looks better and isn't used as much:

results = [a * a for a from [1..3]]

Python uses bracketed syntax for arrays and parenthesized syntax for generators, but CS users already rely on parenthesized syntax, so we'd have to use brackets or curly braces if we go for the wrapping method.

edemaine commented 2 years ago

The brace notation does seem promising. I don't currently see any ambiguities, as for isn't allowed on the key side of an object literal. (If there were a colon there, it would have an existing meaning.)

We can't follow Python's notation because it confuses generators in arrays with array interpolations. Python has the unfortunate feature that [(...)] is different from [...] (when ... has for in it).