rust-lang / rfcs

RFCs for changes to Rust
https://rust-lang.github.io/rfcs/

A `yield` construct in the vein of C# #388

Closed (rust-highfive closed this issue 5 years ago)

rust-highfive commented 9 years ago

Issue by bstrie Friday Jul 12, 2013 at 13:46 GMT

For earlier discussion, see https://github.com/rust-lang/rust/issues/7746

This issue was labelled with: A-an-interesting-project, E-hard, I-wishlist in the Rust repository


@thestinger has mentioned this several times as a wishlist item. This is a feature that greatly simplifies the implementation of external iterators by allowing them to be compiled to state machines at compile time (did I get that right?). See http://msdn.microsoft.com/en-us/library/vstudio/9k7k7cf0.aspx for reference.
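For readers who haven't seen the C# feature: a minimal sketch of the idea, using hypothetical yield syntax in a comment (Rust has no such construct today). The hand-written state machine below is roughly what the compiler would generate:

// Hypothetical generator syntax (not real Rust):
//
//     fn counter(limit: u32) { for i in 0..limit { yield i; } }
//
// A compiler could lower that to a plain struct whose fields hold the
// live variables, with each next() call resuming from the saved state:
struct Counter {
    i: u32,
    limit: u32,
}

impl Iterator for Counter {
    type Item = u32;
    fn next(&mut self) -> Option<u32> {
        if self.i < self.limit {
            let v = self.i;
            self.i += 1; // state persisted across the "yield"
            Some(v)
        } else {
            None
        }
    }
}

fn main() {
    let items: Vec<u32> = Counter { i: 0, limit: 3 }.collect();
    assert_eq!(items, vec![0, 1, 2]);
}

The generator is an ordinary value stepped by next(), with no heap allocation and no context switch; that is what distinguishes this from Python- or Ruby-style generators.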

While we shouldn't expect this for 1.0, it might be prudent to reserve the keyword right now.

Nominating for far-future milestone.

errordeveloper commented 9 years ago

I believe yield conveys pretty much the same meaning in other languages, including Python, Ruby and Lua. I think it's very appropriate to remove "C#" from the title.

thestinger commented 9 years ago

@errordeveloper: It's not implemented with a compile-time state machine in Python and Ruby. C# in the title is definitely appropriate, because (AFAIK) there isn't another language with a comparable feature. The proposal is not about generators / coroutines built on context switches.

errordeveloper commented 9 years ago

@thestinger sure, I didn't fully realise that.

sfackler commented 9 years ago

A decently in-depth discussion of yield support in C++: http://isocpp.org/files/papers/N4134.pdf

errordeveloper commented 9 years ago

@sfackler great find!

errordeveloper commented 9 years ago

Am I understanding correctly that someone needs to make a pull request with an actual RFC write-up? If so, has anyone here already started on that, or is at least intending to do so?

sinistersnare commented 9 years ago

@errordeveloper https://github.com/rust-lang/rfcs/pull/53

zwarich commented 9 years ago

@errordeveloper I have started thinking about the implications of a 'yield' construct for borrow checking, which is something that past proposals have sort of glossed over. I haven't been too anxious to write up an RFC, because there is no chance that it will be considered before 1.0.

Lucretiel commented 9 years ago

Is the fact that it's implemented as a compile-time state machine significant? The usage pattern to the developer is the same as in any other language, albeit with explicit typing.

errordeveloper commented 9 years ago

It is, since that's what makes this feature most useful: state machines generated at compile time can be validated and safety-checked, as well as optimised, by the compiler plugin that implements it.


Ericson2314 commented 9 years ago

That is an interesting bunch of libraries there! It looks like your monad macro has bind as a primitive like Haskell's. I wonder if performance could be improved by using map and join alone as primitives to remove some "administrative lambdas".

Either way (as pipes will contain FnOnces even if the core monad parts don't), it would be nice if there were some way to add an #[inline] to the function the once-closure sugar desugars into. That might help with code locality.

gnzlbg commented 9 years ago

Here's the latest Kohlhoff paper, "Resumable Expressions": a new lower-level C++ alternative to await that allows implementing yield at the library level and combines stackless and stackful coroutines: http://open-std.org/JTC1/SC22/WG21/docs/papers/2015/n4453.pdf

Lucretiel commented 9 years ago

I can't claim to be much of a fan of this proposal. He completely ignores what I think is the major advantage of the await/async (in Python: yield from/coroutine) model: explicit control-flow shifts. One of the major side benefits of single-threaded asynchronous design, as opposed to multithreaded synchronous, is that the control flow is only ever in one place at a time. This means that a lot of the state-sync and atomicity issues in multithreaded design can be avoided. Compounding this advantage is the fact that, with await/yield from, all context shifts are explicit. In effect this means the developer can guarantee that between this await and that one, no other coroutine will touch any shared state.

As an example, I used this to my advantage when writing an IRC-like chat room using Python asyncio, where I had the list of logged-in users as a dict mapping user ids to connection objects. Because I could guarantee exclusive access to this object in a given coroutine between yield froms, I didn't have to worry about locking it, creating a local copy of it, or anything like that.

Unfortunately, in the proposed resumable model, this advantage is lost. Any function could conceivably result in a context switch. The situation is slightly improved by the fact that such functions are explicitly tagged "resumable", but because merely calling the function can cause it to begin running or suspending, there's no way to separate the act of instantiating the function from that of stepping through it.
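A minimal sketch of the exclusive-access guarantee Lucretiel describes, in Rust terms; this uses modern async syntax and the futures crate's single-threaded block_on, both of which postdate this comment, and all names are illustrative:

use std::cell::RefCell;
use std::collections::HashMap;
use std::rc::Rc;

// Shared, unsynchronized state: sound only because a single-threaded
// executor never runs another task between our explicit .await points.
type Users = Rc<RefCell<HashMap<u64, String>>>;

async fn send_to(_id: u64, _msg: &str) {
    // stand-in for real network I/O
}

async fn broadcast(users: Users, msg: &str) {
    // No other task can observe or mutate `users` until the next .await,
    // so no lock and no defensive copy of the map is needed.
    let ids: Vec<u64> = users.borrow().keys().copied().collect();
    for id in ids {
        send_to(id, msg).await; // the only points where control can shift
    }
}

fn main() {
    let users: Users = Rc::new(RefCell::new(HashMap::new()));
    users.borrow_mut().insert(1, "alice".to_string());
    futures::executor::block_on(broadcast(users, "hello"));
}

Because control only transfers at the marked .await, the RefCell borrows can never be contended by another task.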

steveklabnik commented 9 years ago

Worth noting: Python's PEP process just accepted an async/await PEP: https://www.python.org/dev/peps/pep-0492/

ryanhiebert commented 9 years ago

PEP 492 is pretty cool, I'm excited to try the next pre-release of Python 3.5.

In addition to generators and async functions in JavaScript, there's also some work being done on async generators: https://github.com/jhusain/asyncgenerator. The author covers some related concepts in this talk, though the syntax is hidden on the last slide and not really explained: https://www.youtube.com/watch?v=gawmdhCNy-A.

gnzlbg commented 9 years ago

@Lucretiel I find that being able to reuse existing functions as-is is a huge advantage of the "resumable expressions" proposal. The other proposals split the world, and that is in my opinion a very bad thing.

Lucretiel commented 9 years ago

The trouble is that you then lose what I think is one of the primary benefits of async programming: explicit context-switch points. When literally any function can cause a context switch, you're basically no better off than with multithreading: constantly having to use locks and other primitives to ensure all your state is safe. God help you if you get it wrong and have to try to debug it. With explicit async, you know that no other coroutine will execute between one yield and the next, so you're safe to do whatever needs to be done, and be guaranteed that it will appear atomic to other coroutines.

I'll grant that being able to reuse everything has some surface appeal, but I think it's a problem waiting to happen. The reality is that concurrency requires a fundamentally different model of thinking (look at all the classic multithreading problems to see why), and explicit async reflects that model.

Just my two cents, though. It could be that Rust, with its emphasis on static memory safety, doesn't need to worry as much as, say, C++.

jtolio commented 8 years ago

Just to provide a counterpoint, introducing async/await keywords is (IMO) a growing language smell. It means that you're effectively partitioning all code into two competing classes (blocking/async), and that really causes some significant problems for large codebases. Please see http://journal.stuffwithstuff.com/2015/02/01/what-color-is-your-function/ for more.

gnzlbg commented 8 years ago

Just to provide a counterpoint, introducing async/await keywords is (IMO) a growing language smell. It means that you're effectively partitioning all code into two competing classes (blocking/async)

Yes, that basically splits the world.


ryanhiebert commented 8 years ago

Just to provide a counterpoint, introducing async/await keywords is (IMO) a growing language smell. It means that you're effectively partitioning all code into two competing classes (blocking/async)

I completely understand your point of view, having debated it a great deal myself, but I have to disagree. It is a totally different model with different semantics, and you cannot reasonably pretend to do async work synchronously and have it be something that a mediocre programmer, such as myself, can reason about.

While a simple function is just a function, an async function isn't a function at all. The flow is like a function, but the async and await keywords allow us to reason about what race conditions may happen in this function in a sane way, because we know where suspension will happen. The similarity to a function is on purpose, but they aren't the same thing, and for very good reason.

The fact that the later we get explicit async and await, the more libraries will end up with two versions of their functions (one synchronous and one asynchronous), is an indication that it's important to solve this sooner rather than later, so that library authors have the tools they need to make that decision. Then maybe they can choose to have only one version, whichever is appropriate for their work.

Because of Rust's memory safety guarantees, it's easy to conclude that we don't need explicit suspension points, and that Rust's ownership rules will sort out this mess, as it does for memory access. But there are plenty of race conditions that aren't memory-access data races, and programmers need as much help with those as they do with memory races.

So far, I haven't seen how Rust's ownership could make that part of explicit async programming unnecessary. If somebody can solve that, then I'd probably consider further whether explicit async and await keywords were really necessary. For now, I'm convinced that they are.

jtolio commented 8 years ago

While a simple function is just a function, an async function isn't a function at all. The flow is like a function, but the async and await keywords allow us to reason about what race conditions may happen in this function in a sane way, because we know where suspension will happen. The similarity to a function is on purpose, but they aren't the same thing, and for very good reason.

Because of Rust's memory safety guarantees, it's easy to conclude that we don't need explicit suspension points, and that Rust's ownership rules will sort out this mess, as it does for memory access. But there are plenty of race conditions that aren't memory-access data races, and programmers need as much help with those as they do with memory races.

This seems like a surprising position to take; it doesn't strike me as a very good reason so much as a rationalization. As long as there are other threads running, you definitely don't have explicit suspension points. Your argument against yourself here is a good one. Rust is designed to make explicit suspension points unnecessary, which is a good thing. Single-threaded event loops don't make good use of multi-core parallelism.

The fact that the later we have explicit async and await the more we will have libraries that end up with two versions of different functions, one synchronous and one asynchronous, is an indication that it's important to solve this sooner rather than later, so that library authors can have the tools they need to make that decision sooner rather than later. Then maybe they can choose to have only one version, whichever is appropriate for their work.

Again, a surprising position to take. If the language handles async IO for you, then every library always works for you, and there's no bifurcation. Imagine the following scenario: you're writing some small app at low scale. It's a killer app, it does well, and you need to scale it up. Bam, you need to switch to an async IO model (as indicated by https://news.ycombinator.com/item?id=10232131). Well shoot, that sucks. Okay, so you start porting everything, but you start hitting the long tail of libraries. There's only one Rust UPNP library, perhaps, and its author didn't need to scale like you do, and it blocks. Great, so now you either get to write a new library or make a threadpool. But it's not just the UPNP library; there's 20 other little libraries like this. You switch to the async version of some database library and everything starts looking good. A few weeks later you notice your app just keeps blocking up and no longer responds. Turns out one small error case in the database library uses the blocking version of the library, and now your entire event loop is hosed.

This entire problem could be avoided completely.

sinistersnare commented 8 years ago

Somewhat relevant article

glaebhoerl commented 8 years ago

Another somewhat relevant article (and presentation)

rpjohnst commented 8 years ago

Implementation-wise, it is not hard to convert an async function back to a blocking one with little to no overhead. It may be too late to do anything like this in Rust, but it would be interesting to make everything async, and then let typical blocking calls use some syntactically lightweight (perhaps even invisible) way to force them to be synchronous.

This would give people the benefits of Go-like concurrency, without taking away the control and flexibility of async/await.
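A sketch of that "force it to be synchronous" shim, assuming an executor-provided block_on like the one in today's futures crate (function names are made up):

async fn fetch(n: u32) -> u32 {
    // stand-in for real async I/O
    n * 2
}

// The blocking wrapper: drive the future to completion on the current
// thread, so synchronous callers never see the async machinery.
fn fetch_blocking(n: u32) -> u32 {
    futures::executor::block_on(fetch(n))
}

fn main() {
    assert_eq!(fetch_blocking(21), 42);
}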

Ericson2314 commented 8 years ago

Heh, parts of this rant have been a long time coming :)

To those that want to combine everything, I am sympathetic, but it's very important to remember that these different implementation strategies have different tradeoffs, and that matters for a language like Rust. Much talk of continuations, threads, etc. misses that. For example, we can use:

The solution is not to try to paper over these distinctions, but to provide ways to be agnostic/polymorphic as to which is used. For a first step, it's good to make higher-order functions agnostic in their arguments. For example, a suspended thread in my forked libfringe can be treated as a normal closure if the current continuation is prepended as the first argument. Even better, however, would be to write polymorphic code that itself can be specialized to one of those types. We can go either the monadic or the algebraic-effects route for this.

For those that care, I'd argue the root of the problem is that Rust (and C) implicitly manipulate the call stack, unlike a hypothetical typed assembly language. In such a typed assembly language, a contiguous call stack, a segmented call stack, webs of dynamically dispatched closures, finite state machines, etc. can all be on equal footing. Then everything we want here could be left to libraries rather than the compiler, and all would be well.

It's infeasible to ditch LLVM and get really fancy dependent types in the at-all-near future (let me not suggest otherwise), so we will need compiler extensions/plugins. But I think with enough cleverness we can integrate that with language-level abstractions like a monad trait / effect system. I have some hunches on how that might be done, but they're pretty WIP, and I've already said some odd stuff, so I'll call it quits here.

ryanhiebert commented 8 years ago

@rpjohnst:

Implementation-wise, it is not hard to convert an async function back to a blocking one with little to no overhead. It may be too late to do anything like this in Rust, but it would be interesting to make everything async, and then let typical blocking calls use some syntactically lightweight (perhaps even invisible) way to force them to be synchronous.

This ability to run async code from synchronous code is somewhat assumed. The split-the-world issue still exists, though, and this doesn't do anything to solve it, unfortunately. The link titled What Color is Your Function? (shared twice above by different people) explains why this correspondence exists. Though it could be a neat language feature if we did end up with async/await keywords.


@jtolds:

As long as there are other threads running, you definitely don't have explicit suspension points. Your argument against yourself here is a good one. Rust is designed to make explicit suspension points unnecessary, which is a good thing.

Using threads implies implicit suspension points. That doesn't mean there aren't some explicit suspension points in there too, but you have to consider both complexities together when you do that, so it's something you likely wouldn't want to do without serious consideration.

Rust isn't designed to make explicit suspension points unnecessary, AFAIK; it simply doesn't have the feature, it being a post-1.0 feature. Perhaps there's some future enhancement that would make it unnecessary, but I haven't heard of it yet.

Single-threaded event loops don't make good use of multi-core parallelism.

You are correct that single threaded event loops can't use multi-core parallelism. That's the whole point. And if you're using them, they don't need to, because the CPU isn't your bottleneck, IO is.

Again, a surprising position to take. If the language handles async IO for you, then every library always works for you, and there's no bifurcation.

bifurcation. And I learn a new word, thank you. :smiley:

It depends, of course, on what you mean by "handles it for you". Rust's type system can make things memory-safe, but there are other conditions that you have to solve yourself, because Rust does not solve them for you.

Consider the dual-lock deadlock. In a single-threaded, explicit-suspension-points world, handling such a situation is trivial (python-ish pseudo-code; I'm overlooking some problems Rust saves you from in order to show you one it doesn't):

import get_big_thing  # A big IO bound task.

# Both locks are required simultaneously to safely run get_big_thing
mutex1, mutex2 = Mutex(), Mutex()

async def this_way():
    lock1 = mutex1.lock()
    lock2 = mutex2.lock()
    return (await get_big_thing())
    # Locks released at end of scope

async def that_way():
    lock2 = mutex2.lock()
    lock1 = mutex1.lock()
    return (await get_big_thing())
    # Locks released at end of scope

# Some run loop runs these two coroutines

The problem here is a deadlock. If a synchronous version of this code is run on multiple threads to make it parallel, it would be easy for the first lock to be obtained by this_way and the second lock by that_way, at which point neither could continue, because each needs a lock the other already holds.

Using a single thread to run these coroutines (because our bottleneck is IO, not CPU) works really well, because the explicit await points mean that the programmer can easily identify the places where something else might happen first.

It's important to note that the purpose of this is not to make it easy to convert synchronous code to async code. The purpose is to allow the author to reason about async code. The explicit suspension points help the author to do that.

This point is so important that Python chose not to implement the awaitable protocol on the futures module, which uses threads / processes to resolve them. The async / await model only really works well when you're dealing inside a single thread, but it can be invaluable in that context.

Imagine the following scenario [snip long demonstration of the long-tail effect]

Yes, the long-tail effect really exists. The longer we wait before providing the choice, the worse it gets, because there's more code that has to be considered retroactively. I expect this to be a big burden in the Python community going forward. They still added the feature.

Unless there's another alternative that can make these deadlocks and race conditions that the borrow checker does not solve easier to reason about, this solution does indeed help programmers reason about what will happen in their program. Until there is a better solution to the problem, all we do by waiting is exacerbate it.

jtolio commented 8 years ago

You are correct that single threaded event loops can't use multi-core parallelism. That's the whole point. And if you're using them, they don't need to, because the CPU isn't your bottleneck, IO is.

Not necessarily. Event loop/promise processing adds overhead, and there's a maximum amount of overhead a single core can handle, which limits the number of (possibly small) IO operations that can be handled. This isn't academic; it's the reason my company switched from Python+Twisted to Go. We were CPU-bound when we should have been IO-bound. That said, it was Python + Twisted, so I admit the overhead was unnecessarily large.

async def this_way():
    lock1 = mutex1.lock()
    lock2 = mutex2.lock()
    return (await get_big_thing())
    # Locks released at end of scope

In this example, I assume you mean to have lines like lock1 = await mutex1.lock()? Surely you don't mean to use actual mutexes, but instead async mutexes in your event-loop code. Even with your description, I admit I am baffled how async helps with deadlocks here. Either you mean real mutexes, in which case you have a blocked event loop waiting to happen and you didn't need the mutexes to begin with (because there's no actual parallel multi-core contention), or you mean async mutexes, in which case you still get an application-level deadlock.

It's important to note that the purpose of this is not to make it easy to convert synchronous code to async code. The purpose is to allow the author to reason about async code. The explicit suspension points help the author to do that.

Not to belabor the point but so far I've only seen more confusion.

Yes, the long-tail effect really exists. The longer we wait before providing the choice, the worse it gets, because there's more code that has to be considered retroactively. I expect this to be a big burden in the Python community going forward. They still added the feature.

The initial release of Twisted was 12 years ago. I know they're finally adding deferreds and event loops to the standard lib, but there's been prior art and experience with this stuff for a long time.

Unless there's another alternative that can make these deadlocks and race conditions that the borrow checker does not solve easier to reason about...

I know this isn't news or anything, but it is impossible to solve all deadlocks, in a mathematical proof sort of way.

Anyway, I guess I should point out I understand the reasons why Rust decided to not do green threads and why that results in people discussing async IO, but it's just super disappointing. When something is so close to perfect you really care about those last few blemishes.

jtolio commented 8 years ago

Also, I want to point out that I totally give you that the async style does allow the programmer to see when she has full control of the thread of execution. That is a benefit. But everything is tradeoffs, and if I had to order my priorities that benefit would be very far down the priority list, certainly underneath being able to use all the third party libraries I need.

Additionally, I have seen subtle synchronization bugs introduced when someone has some ordered logic depending on complete control of the thread of execution, and then some later refactor comes in and turns a synchronous CPU-bound call into an async call, and then the synchronization logic isn't correspondingly updated. I think relying on having full control of the CPU is the wrong kind of thing to rely on, and personally I'd rather not fool people into thinking about it.

But yes, it is all about tradeoffs.

ryanhiebert commented 8 years ago

@jtolds

Event loop/promise processing adds overhead, and there's a maximum amount of overhead a single core can handle, which limits the number of (possibly small) IO operations that can be handled. This isn't academic; it's the reason my company switched from Python+Twisted to Go. We were CPU-bound when we should have been IO-bound.

Good point. One thought I had was possibly using multiple threads to run multiple loops, which can bring the benefits of both if done with care. But yes, there are definitely good reasons to use threads (or perhaps processes), even with IO-heavy workloads and explicit async loops, and extra care has to be taken in those cases.

In this example, I assume you mean to have lines like lock1 = await mutex1.lock()?

Facepalm. While I did not mistype, you are certainly correct that these would have to be async mutexes. I was trying to be clever, but ended up demonstrating most aptly how splitting the world can be confusing.

I know this isn't news or anything, but it is impossible to solve all deadlocks, in a mathematical proof sort of way.

Also, I want to point out that I totally give you that the async style does allow the programmer to see when she has full control of the thread of execution. That is a benefit.

But yes, it is all about tradeoffs.

True, we're discussing which trade-offs are best to aid the programmer in reasoning about their code. Explicit suspension points are one way to do that, and they have been effective in doing so in several languages already. However, that doesn't mean that they are right for Rust, and that the trade-off would be worth it in Rust code.

As I aptly demonstrated, explicit suspension points, while helpful, still leave a great deal of synchronization complexity to manage. Having the suspension points explicitly defined doesn't change that complexity, but it may limit the areas where you have to look for it.

In a language like Rust, with its emphasis on type safety and safe abstractions, the added benefit of making the suspension points explicit may not be worth splitting the world for, even though it often is in languages where the programmer is left more to their own devices (such as Python). Thank you for helping me understand that.

Anyway, I guess I should point out I understand the reasons why Rust decided to not do green threads and why that results in people discussing async IO, but it's just super disappointing.

I think that I understand the reasons too, but I'd be interested in reading more about that if you have some links to point to.

When something is so close to perfect you really care about those last few blemishes.

Indeed. Thank you for that care, and for the experience that you bring to the conversation. I've found it most valuable.

ryanhiebert commented 8 years ago

Back to the original topic of generators, is there any work done figuring out how the ownership rules might work? Generators are a great tool for making iterators easily, and I'd love to push the work forward.

rpjohnst commented 8 years ago

When something is so close to perfect you really care about those last few blemishes.

Giving the programmer explicit control over allocation and synchronization in a language like Rust is not a blemish, even if it "splits the world." Rust already has several splits that could be unified if we were willing to take away that control.

jtolio commented 8 years ago

First off, I just want to pause and say that @ryanhiebert is truly setting the example of how to have a discussion on the internet. What a gracious and kind guy! Thank you Ryan, talking has truly been a pleasure.

I think that I understand the reasons too, but I'd be interested in reading more about that if you have some links to point to.

I actually thought I understood better than I did until Steve Klabnik pointed out they tried a pluggable fiber system on this thread: https://lobste.rs/s/cvdbal/asynchronous_io_in_rust/comments/mpyfhf#c_mpyfhf He suggested I read https://github.com/rust-lang/rfcs/blob/master/text/0230-remove-runtime.md, which I haven't done yet, but I suspect will be the best place to go check out.

steveklabnik commented 8 years ago

Yup, you beat me to it!

pmj commented 8 years ago

I realise this might be little more than a "me too"/+1 comment, but the type of issue addressed by this RFC is something I deal with frequently. My main interest in Rust is as a replacement for C & C++ when writing kernel code. So far, I've not really made it beyond hello world with Rust in a kernel, but I'm slowly getting there with making it do real things. (Tangent: the main problem I seem to keep running into is that lots of useful functionality is in large libraries which have dependencies I can't fulfil in the parts of the library I'm not interested in.)

OS kernel code tends to be callback-heavy; C and pre-lambda C++ style callbacks are extra confusing, but even lambda/closure syntax only moves the code around a little bit, it doesn't make the control flow intent any clearer. (e.g. what would be a loop in blocking I/O looks nothing like it in callback-style async) Although thread stacks are tiny compared to userspace, they use non-pageable/wired memory, so idle threads consume more than just trivial resources. Meanwhile, simply creating a new thread isn't possible just anywhere. Lots of areas of code have strict rules about not blocking, and any (reliable) memory allocation might block, including a thread allocation. So you usually end up pre-allocating a context struct that tracks all the necessary state for an operation across callbacks, and pass that around. Compiler support for extracting that state struct and turning suspension points in (much clearer) blocking-style code into the appropriate callback function pointers would be amazing, assuming the memory allocation semantics can be controlled.
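A sketch of the hand-rolled pattern described here: a preallocated context struct threaded through C-style completion callbacks (all names hypothetical, not from any real kernel API). It is exactly the state machine a compiler-supported yield would generate automatically:

#[repr(C)]
struct IoContext {
    state: u32,   // which step of the multi-callback operation we're in
    buf: *mut u8, // preallocated up front; allocating later might block
    len: usize,
}

// Completion callback: the device re-enters here after each request,
// and `state` picks up where the previous step left off.
extern "C" fn on_io_complete(ctx: *mut IoContext, _status: i32) {
    unsafe {
        match (*ctx).state {
            0 => {
                (*ctx).state = 1;
                // issue the next request, passing `ctx` and this callback again
            }
            _ => {
                // final step: hand the result in `buf` back to the client
            }
        }
    }
}

fn main() {
    // In a real driver the device fires the callback; here we only show
    // that the context is plain, preallocated data.
    let mut ctx = IoContext { state: 0, buf: std::ptr::null_mut(), len: 0 };
    on_io_complete(&mut ctx, 0);
    assert_eq!(ctx.state, 1);
}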

I suspect embedded developers might have similar requirements.

In summary, I don't care too much about the syntax of such a feature; I've been using C & C++ for the last 15 years, so it can't really get any worse. What I do care about, and what I hope might be taken into account when designing the feature, is that it:

  1. Doesn't mess with the stack.
  2. Doesn't depend on a large amount of library baggage to use productively.
  3. Gives the programmer control over memory allocation.
  4. Can be turned into C-compatible callback function pointers in some way where necessary.

(I've got an I/O-heavy kernel extension side project that I'm hoping to port to Rust & open source when it's done, which I'd be happy to use/provide as a testbed/demo for this sort of thing.)

toyotabedzrock commented 8 years ago

Relevant article from Facebook about a C++ template library they made to simplify using async to hand off work that would otherwise block the thread: https://code.facebook.com/posts/1661982097368498/futures-for-c-11-at-facebook/

ticki commented 8 years ago

What's the state of this?

tbu- commented 8 years ago

It's a feature that would be nice to have. There's no specific plan or RFC about this yet.

erickt commented 8 years ago

I've started prototyping this out in a plugin in https://github.com/erickt/stateful. It currently somewhat supports conditionals and loops, but not yet mutable variables :)

ryanhiebert commented 8 years ago

@erickt: Looks pretty cool; I'm glad to see there's work happening on this. I'll be very interested to see how it evolves. I'm following the repo and am interested in participating in any discussions there; I'm not sure I'll find time to help with more than discussion, but I'll certainly take part in that.

burdges commented 8 years ago

Very interesting libraries by @freebroccolo! And links by others. I'm sure folks here know about these, but…

There are a variety of interesting libraries employing mio, including mioco, which uses coroutines, and rotor, which makes specifying explicit state machines nicer.

In the abstract, I'm partial to the rotor approach of trying to make it pleasant to explicitly specify the state machine. It should be easier to debug, write tests for, quickcheck, etc., although that obviously depends upon many factors. It'd be really cool if this stuff eventually coalesced into a sensible way to intermix explicit state machines and more implicit ones like coroutines.

mikeyhew commented 8 years ago

@pmj Most of my programming experience, especially with asynchronous patterns and I/O, has been web programming in dynamic languages like JavaScript, Python, and Ruby, and I never realized that systems programming is also callback-heavy. So thank you for joining the discussion.

  1. Doesn't mess with the stack.
  2. Gives the programmer control over memory allocation.

Can you please explain the above two points in more detail? I'm particularly curious to know what you mean by "messing with the stack".

pmj commented 8 years ago

@mikeyhew Well, systems programming tends to involve a fair amount of I/O.

At the OS kernel-to-peripheral device interface layer, you tend to have memory-mapped device memory & registers, DMA, and interrupts as your main communication methods. Interrupts are inherently asynchronous, and are typically routed to device drivers as callbacks (primary or secondary interrupt handlers), although there's normally not a clear 1:1 request:response ratio. Once you get to higher level interfaces, such as block device I/O, it's usually quite similar. Modern kernels also include a considerable amount of network protocol implementations, at both the relatively low level (Ethernet, IP, TCP) and the high level end of the spectrum (network-based file systems and block devices spring to mind).

  1. Doesn't mess with the stack.
  2. Gives the programmer control over memory allocation.

Can you please explain the above two points in more detail? I'm particularly curious to know what you mean by "messing with the stack".

  1. Many common OS kernels use the stack pointer register ($esp/$rsp on x86/x86-64) to identify the currently running thread/task/process. They do this by defining a thread layout convention: e.g. each thread gets 16KiB, aligned to 16KiB virtual memory addresses, with a struct containing various information for identification, scheduling, etc. at the end or beginning of that block, and the rest of the memory used as the kernel-space stack for that thread. So if the currently running thread needs to be identified at any point, you just round the stack pointer up or down to the nearest 16KiB boundary and look for the thread struct there (a sketch of this follows the list). This also means that if you mess around with the stack pointer (setting it to outside those 16KiB, for example), a whole load of things will go wrong.
  2. Control over memory allocation - I'm specifically talking about heap memory here. Stack memory is in short supply anyway, so that's not much use. Kernel heap allocations are usually pooled. But if the pool for the requested size runs out, the allocator will need to find a spare physical memory page, map it into the kernel virtual address space, and return an address within that. Spare physical pages will also run out of course, so often this means a trip to the pager, which steals a physical page from some process's pageable memory to fulfil the request. This is fine, but requires a lot of coordination and synchronisation, especially if multiple CPUs are involved. (Which these days is the norm, not the exception.) As a result, there are situations where allocations that invoke the pager are dangerous as they could cause a system deadlock.
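A sketch of point 1, assuming the 16KiB convention described above (illustrative only; real kernels do this in assembly or with compiler intrinsics):

const STACK_SIZE: usize = 16 * 1024; // per-thread kernel stack, 16KiB-aligned

#[repr(C)]
struct Thread {
    id: u64,
    // scheduling state, etc.; the rest of the 16KiB block is the stack
}

// Round an address within a thread's stack down to its 16KiB boundary,
// where the thread struct lives by convention.
fn current_thread(stack_ptr: usize) -> *const Thread {
    (stack_ptr & !(STACK_SIZE - 1)) as *const Thread
}

fn main() {
    // Any two addresses on the same aligned stack resolve to the same struct.
    assert_eq!(current_thread(0x8000_0321), current_thread(0x8000_3f00));
}

This is also why a yield built on stackful coroutines (which swap the stack pointer) would break such kernels, while a compile-time state machine would not.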

For example, let's take a block device driver. The job of this driver is to accept I/O requests (disk reads, disk writes, etc.) from whoever the client is (file systems, etc.), pass them to the device in question, and when the device is done processing the request, notify the client that it has completed. If as part of this process the driver were to allocate heap memory, this could cause deadlock. Why? Most modern OSes will page memory to disk (swap file or volume). Let's say the system is low on memory, so the pager needs to save some memory pages to disk. It requests a write from the disk driver. The disk driver maintains a list of requests, and needs to add this request, so it allocates some heap memory for it. Turns out the pool is empty, so let's look for a spare page. Nope, system is low on memory, so it invokes the pager to evict some pages. The pager decides it needs to write some pages to disk, which means generating a disk write request. So it notifies the block device driver… etc. Boom!

So you have to write this type of code in a way that doesn't trigger paging. Which is fine, as long as your programming language makes it clear what code will or will not generate allocations. Some higher-level language features will intrinsically require memory, which is absolutely fine, as long as the code can control the allocations. Many higher-level languages will allocate heap memory implicitly, which rules them out entirely; but if allocation time and use time can be split, there's no reason not to offer the language feature. So, for example, if the device driver can determine how much memory will be required for each request, it can pre-allocate memory for a fixed number of requests.
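A sketch of that preallocation pattern: a fixed pool of request slots sized once at init time, so the hot path never calls the allocator (types and fields are made up):

struct Request {
    sector: u64,
    len: usize,
}

struct RequestPool {
    slots: Vec<Option<Request>>, // allocated once, at initialization
}

impl RequestPool {
    fn with_capacity(n: usize) -> Self {
        RequestPool { slots: (0..n).map(|_| None).collect() }
    }

    // Returns None instead of allocating when the pool is exhausted.
    fn acquire(&mut self, req: Request) -> Option<usize> {
        let idx = self.slots.iter().position(|s| s.is_none())?;
        self.slots[idx] = Some(req);
        Some(idx)
    }

    fn release(&mut self, idx: usize) {
        self.slots[idx] = None;
    }
}

fn main() {
    let mut pool = RequestPool::with_capacity(2);
    let idx = pool.acquire(Request { sector: 7, len: 4096 }).unwrap();
    pool.release(idx);
}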

I hope that explanation makes sense and gives you more of an idea of why such an async feature would be interesting for systems code, and what the extra requirements are to make it useful in that situation. In particular, if there's no way to tell how big an async continuation context will be until you actually try to create one, then that's too late for preallocating that memory.

DemiMarie commented 8 years ago

I think one requirement here is that the continuation context must be Sized – otherwise it cannot be preallocated.

As far as generating C-style function pointers: the best I can see is for the compiler to either generate an iterator (that's what C# does) or to generate something like an iterator, but heterogeneously typed. The user would need to convert it into C function pointers manually.

@pmj Can you give an example of code that you would like to be able to write?

ahicks92 commented 8 years ago

What is the status of this? Also, should we close this in favor of #1081?

If not, I might be interested in working on it. I've got ideas, but this doesn't seem too hard. That said, I haven't read the Rust compiler at all or even done compiler work, so my viewpoint on how to do it is as a pure source transformation in a Rust-like language that allows unsafe gotos.

Rust likes iterators, as do I. But writing your own iterator is currently kind of a mess.

erickt commented 8 years ago

@camlorn: stateful is coming along quite nicely with the help of @NSil. We are getting pretty close to a 0.1 release; we just need to implement a few more things for dealing with temporaries across yield points. I'd love any help getting it ready :)

eddyb commented 8 years ago

@camlorn FWIW, an official implementation would have to be done on MIR to be optimal - where it should be easier, too, as the MIR is already a CFG.

erickt commented 8 years ago

@eddyb: I know you already know this, but for everyone else: it should be pretty straightforward to port Stateful's state-machine transformations to MIR, since Stateful is heavily inspired by MIR. Really, the tricky thing is waiting on impl trait in order to implement this without boxing the iterator type.
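A sketch of the boxing issue, in today's syntax (this thread predates stable impl Trait; the dyn keyword is also a later addition):

// Without impl Trait, an anonymous iterator type like the one a state
// machine transform produces can only be returned behind a trait-object box:
fn evens_boxed(limit: u32) -> Box<dyn Iterator<Item = u32>> {
    Box::new((0..limit).filter(|n| n % 2 == 0))
}

// With impl Trait, the same iterator is returned by value, no allocation:
fn evens(limit: u32) -> impl Iterator<Item = u32> {
    (0..limit).filter(|n| n % 2 == 0)
}

fn main() {
    assert_eq!(evens_boxed(7).count(), evens(7).count());
}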

eddyb commented 8 years ago

@erickt There are two reasons why a port is unnecessary/undesirable:

ahicks92 commented 8 years ago

How are recursive generators handled?

In languages that aren't Rust, the generator can make another instance of itself and then yield from it using a for loop. I think lifetimes can be used to prevent it, but maybe it's something we want to allow.

Python has a yield from, but I think they added it for I/O, so perhaps this isn't necessary. It might also be a good use for the become keyword, which I read is reserved; the implication being that you can "become" another generator. But become could also be used for tail calls in the future, so possibly not, because of the double meaning.

@erickt I'll take a look. Not sure my Rust is good enough to help, but it might be.

eddyb commented 8 years ago

@camlorn Such recursion would become a type recursion and require boxing to avoid the state having an "infinite" size.
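Concretely, a sketch of the boxing eddyb means, written as the hand-rolled state machine a recursive generator would lower to (names are illustrative):

// Each instance may own another instance of itself; without the Box,
// Countdown would contain a Countdown by value, an infinitely sized type.
struct Countdown {
    n: u32,
    inner: Option<Box<Countdown>>,
}

impl Iterator for Countdown {
    type Item = u32;
    fn next(&mut self) -> Option<u32> {
        // First drain the recursive "yield from" child, if any.
        if let Some(inner) = self.inner.as_mut() {
            if let Some(v) = inner.next() {
                return Some(v);
            }
            self.inner = None;
        }
        if self.n > 0 {
            let n = self.n;
            // Recurse: the child generator's state lives behind the Box.
            self.inner = Some(Box::new(Countdown { n: n - 1, inner: None }));
            self.n = 0;
            return Some(n);
        }
        None
    }
}

fn main() {
    let it = Countdown { n: 3, inner: None };
    assert_eq!(it.collect::<Vec<_>>(), vec![3, 2, 1]);
}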