albertodemichelis / squirrel

Official repository for the programming language Squirrel
http://www.squirrel-lang.org
MIT License

Feature Ideas: generator extensions #182

Open jayeheffernan opened 5 years ago

jayeheffernan commented 5 years ago

I think the utility of Squirrel generators could be greatly expanded with the addition of two features, outlined below. Thanks for your consideration.

Recently I've been using generators quite a lot, comparing them to similar features in other languages, and having a go at using them to implement "async/await"-like features in Squirrel. Async/await almost seems doable in Squirrel: async/await ≈ generators + promises, we have generators, and (by including a library) we can have promises. Unfortunately, a couple of key features are missing from Squirrel generators, so a usable general-purpose async/await-like method of flow control for Squirrel is off the table for now. Both of these features have precedent in other similar languages. I'm interested in hearing from the author or any contributors whether they would consider implementing any of these features.

Main Features

1: Passing values into a running generator

Squirrel passes the argument of a yield statement to the outside world as the return value from the resume statement. This yield-to-resume communication allows the generator to pass information to the calling code. Currently there is no obvious way to pass information back the other way. Data can flow in this direction by updating shared state, but the resulting code is brittle and ugly enough to rule it out for many use cases, including a decent async/await implementation.

I suggest implementing resume-to-yield communication: the resume statement (or something similar) can be passed an extra argument, which then becomes the return value of the yield statement. This is in line with the functionality of generators in Javascript and Python, and coroutines in Lua. The existing use of the resume statement with one argument, a generator, must stay intact, but a new version can be added to take a second argument.

Here are some ideas for the syntax of such a feature, assuming a generator object gen being passed a new value val:

// Function-style
resume(gen, val)

// Method-style (like Javascript)
gen.resume(val)
// or
gen.next(val)

// Extended statement
resume gen : val
// or
resume gen with val
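For comparison, here is a minimal sketch of the same resume-to-yield communication in Python, one of the languages cited above as precedent. It is written in Python because the Squirrel syntax above is only a proposal and cannot be run today. Python's `send()` plays the role of the proposed two-argument resume; the `running_average` generator is an illustrative name, not part of any proposal.

```python
def running_average():
    """Yield the average so far; each value sent in becomes the result of `yield`."""
    total = 0.0
    count = 0
    average = None
    while True:
        # `yield` hands `average` out to the caller and evaluates to
        # whatever the caller passes to send().
        value = yield average
        total += value
        count += 1
        average = total / count


gen = running_average()
next(gen)              # prime the generator: run up to the first `yield`
first = gen.send(10)   # resume, making the pending `yield` evaluate to 10
second = gen.send(20)  # resume again with 20
```

The caller never touches `total` or `count`: all data flows in through the `yield` expression, which is the property missing from Squirrel's one-argument resume.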

2: Throwing exceptions into a generator

The previous feature suggested a way to resume a generator by passing in data; this feature suggests resuming a generator with an exception. The idea is that calling code (code that is external to the generator, but holds a reference to the generator object) can resume the generator and immediately raise an exception inside it. In many cases, this will cause the generator to end immediately, but the exception can be caught synchronously from inside the generator with a try-catch around the last-executed yield line. This matches Javascript's generator.throw() and Python's generator.throw(), though I can't find matching functionality for this one in Lua.

Ideas for new syntax, assuming a generator gen and an error message string msg:

// Method-style (like Javascript and Python)
gen.throw(msg)

// Extended statement
resume gen with error msg;
// or
resume gen throwing msg;
// or
throw msg into gen
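As a point of reference, this is roughly how the existing Python precedent behaves. It is a sketch in Python rather than Squirrel, since the syntax above is hypothetical; the `worker` and `fragile` generators are made-up names for illustration.

```python
def worker():
    """Yield once, then recover from an exception thrown in at that yield."""
    try:
        yield "waiting"
    except ValueError as err:
        # The exception surfaces at the suspended `yield`, so an ordinary
        # try/except around it handles the error synchronously.
        yield "recovered from: " + str(err)


def fragile():
    """A generator with no handler around its yield."""
    yield "waiting"


gen = worker()
first = next(gen)                          # run to the first yield
second = gen.throw(ValueError("timeout"))  # resume by raising inside the generator

# Without a handler, the exception ends the generator and propagates
# back out to the caller of throw().
unprotected = fragile()
next(unprotected)
try:
    unprotected.throw(ValueError("timeout"))
    propagated = False
except ValueError:
    propagated = True
```

Both outcomes described above appear here: `worker` catches the error and carries on, while `fragile` is terminated by it.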

Supplementary Features

3: Async-await

If there's interest in this, we could explore what more needs to be done to enable async-await syntax to be built in to the Squirrel language.
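To make the "generators + promises" idea concrete, here is a rough sketch in Python (not Squirrel, because it depends on main features 1 and 2, which Squirrel generators do not have yet). The `Deferred` class is a tiny hypothetical stand-in for a promise and `run` is a hypothetical driver; neither is taken from an existing library.

```python
class Deferred:
    """A tiny stand-in for a promise (hypothetical, for illustration only)."""

    def __init__(self):
        self._callbacks = []
        self._settled = None  # None, or ("ok", value) / ("err", exception)

    def then(self, on_ok, on_err):
        self._callbacks.append((on_ok, on_err))
        if self._settled is not None:
            self._fire()

    def resolve(self, value):
        self._settled = ("ok", value)
        self._fire()

    def reject(self, error):
        self._settled = ("err", error)
        self._fire()

    def _fire(self):
        kind, payload = self._settled
        callbacks, self._callbacks = self._callbacks, []
        for on_ok, on_err in callbacks:
            (on_ok if kind == "ok" else on_err)(payload)


def run(gen):
    """Drive `gen`: each yielded Deferred resumes it with a value or an error."""

    def step(resume):
        try:
            pending = resume()
        except StopIteration:
            return  # the "async function" has finished
        pending.then(
            lambda value: step(lambda: gen.send(value)),   # main feature 1
            lambda error: step(lambda: gen.throw(error)),  # main feature 2
        )

    step(lambda: next(gen))


log = []
first, second = Deferred(), Deferred()


def task():
    value = yield first  # reads like `value = await first`
    log.append("got " + str(value))
    try:
        yield second
    except RuntimeError as err:
        log.append("caught " + str(err))


run(task())
first.resolve(7)
second.reject(RuntimeError("network down"))
```

The driver needs exactly two operations on the generator: resuming it with a value and resuming it with an error. That is why both main features would have to exist before this pattern could be ported to Squirrel generators.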

Others

Here are some other, smaller ideas for making Squirrel generators more useable.

4: yield from

Javascript and Python both have a syntax for delegating to another generator (or iterable) from inside a generator. Javascript has yield* gen and Python has yield from gen. A similar feature could be useful in Squirrel. For example:

// Yield integers up to `n`
function count(n) {
    for (local i = 0; i < n; i++) yield i;
}

// Yield integers up to `n`, `m` times
function countTwice(n, m) {
    for (local i = 0; i < m; i++) yield from count(n);
}

If added to Squirrel now, this would just be syntactic sugar for foreach (v in gen) yield v, but it can get a little more powerful if the main features above are implemented. As explained in this StackOverflow answer (for Python), the delegation creates a bidirectional link between the inner generator and the calling code. So, for example, values passed back into the generator from the calling code (with the new resume-to-yield communication feature) would be available as the return value from yield statements inside the inner generator.
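The bidirectional link can be seen in a short Python sketch. Python is used because the behaviour depends on passing values into generators, which Squirrel does not support yet; `inner` and `outer` are illustrative names.

```python
def inner():
    # Values sent by the outermost caller arrive here, through the delegation.
    received = yield "inner ready"
    yield "inner got " + str(received)


def outer():
    yield from inner()  # delegate: yielded and sent values pass straight through
    yield "outer done"


gen = outer()
first = next(gen)      # runs into inner() up to its first yield
second = gen.send(42)  # 42 becomes the result of the yield inside inner()
third = next(gen)      # inner() finishes, so outer() continues
```

The calling code only ever holds a reference to `outer`, yet the value it sends is received inside `inner`.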

5: Checking if a generator is done

Until today, I've never had to manually check whether or not a generator was "dead". I couldn't see how I would be able to tell if the value returned from resume was null because of a yield null (in which case I should use it), or because of a return; (in which case I should drop it, and stop iterating the generator). I then found the generator.getstatus() method, which solves my issue, but I find gen.getstatus() == "dead" just a little verbose and confusing for my needs. What about something like gen.done() or gen.isdone(), that just returns a boolean? Or, if you chose to implement gen.next() as a way of passing values into generators, you could make the return value of .next() match Javascript's: a table with keys "done" and "value".
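As a rough illustration of the last option, here is a Python sketch of a Javascript-style result record. The `step` helper is hypothetical and only mimics the shape of Javascript's `.next()` result; it is not an existing API in Python or Squirrel.

```python
def step(gen):
    """Advance `gen` once and report the result as a Javascript-style record."""
    try:
        return {"done": False, "value": next(gen)}
    except StopIteration as stop:
        # A finished generator is reported explicitly, so a yielded None
        # is never confused with the end of iteration.
        return {"done": True, "value": stop.value}


def maybe_nothing():
    yield None
    yield 1


gen = maybe_nothing()
results = [step(gen), step(gen), step(gen)]
```

The first record carries a legitimately yielded `None` with `done` set to false, while the third reports the end of the generator. That distinction is the one that currently requires the separate getstatus() check.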

albertodemichelis commented 5 years ago

Aren't most of these scenarios already covered by Squirrel threads? (See newthread() etc.) In my company's back end we use "async/await"-like semantics for almost everything using Squirrel threads; everything is already there. I agree that 'gen.getstatus() == "dead"' is kind of verbose and that a bool function would be nice to have.

jayeheffernan commented 5 years ago

Aha, interesting question. For context, I am coming from writing Squirrel for the Electric Imp platform. Threads are not implemented there, so they are not an option, and I'm not very familiar with how they work. I'd be interested to see some example code of threads being used in a way similar to async/await.

They do seem closer to what I'm after. Passing values into threads is already supported by .wakeup(). The yield from behaviour of passing values into nested calls seems to be supported (?): "This allows a thread to suspend nested calls". I have now realised that this behaviour (both for passing values in, and for throwing errors in), is integral to a proper async/await implementation, so I would add it to the "main features" list for generators. It would seem that threads are a shorter path to something like async/await.

The key differences between threads and generators seem to be that "a thread has its own execution stack, global table and error handler", and that "This allows a thread to suspend nested calls" (which is the yield from behaviour, which I hope can be implemented for generators as well). Normally, async/await (in Python or Javascript, for example) does not use its own execution stack. A threads-based implementation could still be reasonable, but this difference could cause unwanted behaviour and complications, so I think a generator-based implementation would be better (if possible).

Electric Imp say "the Electric Imp version of Squirrel does not currently support thread usage, for memory utilisation reasons and because the same effect can be achieved with less complexity using generators". We can't achieve quite the same effect, since thread.wakeup() takes a wakeup argument and resume doesn't, but I do agree with the comment on complexity. I personally find generators more intuitive, in function and in syntax. This is based on a background of mainly C, Python, and Javascript. In these languages I have used generators extensively, and traditional threads a bit, but never coroutines (although Python does have them).

So the reasons I think improving generators is worthwhile even if threads cover similar use cases are:

  1. Generators are simpler to understand (maybe, for some)
  2. Threads and generators already overlap in their functionality, so adding a new feature to a generator shouldn't be dismissed just because it already exists for threads. We have both, so why not improve on both (where appropriate)?
  3. If features are implemented for generators, they might make it downstream to the Electric Imp implementation where I can make use of them, which I'm not expecting to happen with threads any time soon
  4. In cases where either can be used, it seems better to use generators given reason #1, and given that the new execution stack and global table are not necessary
  5. async/await is normally implemented within the same caller stack, suggesting generators over threads

Regarding #4, can you also comment on the extra memory usage of threads over generators? It seems like Squirrel threads are much more lightweight than traditional threads, but I'm wondering if there's still much overhead remaining.

Of course, making these changes won't make any difference to me personally unless they're picked up by Electric Imp, which could depend on how much new code needs to be added.

jayeheffernan commented 5 years ago

I want to add that I've just seen the recently added method thread.wakeupthrow(), which is what I was wanting for generators with my main feature request #2.

I think implementing something like async/await with threads would definitely be worth trying, but that a generator-based implementation (if possible) would be a little better. I also think there's good reason to add these features to generators even if a threads-based async/await implementation works well enough. Let me know if you'd like me to elaborate on either of these points.

If you are considering adding these features for generators, my C++ is not good enough to help with the implementation, but I'd like to offer help where I can (e.g. writing Squirrel code to test out new features).