async-rs / async-std

Async version of the Rust standard library
https://async.rs
Apache License 2.0

More flexible reactor/executor API #79

Open sdroege opened 5 years ago

sdroege commented 5 years ago

While IMHO this is something that shouldn't be considered for 1.0, it would be good to start discussing what such an API could look like and what requirements different folks have here.

Also, while this is somewhat related to https://github.com/async-rs/async-std/issues/60, my main point here is being able to control the lifetime of the reactor/executor, to run multiple of them, and to decide which one is used when and where. See also https://github.com/rustasync/runtime/issues/42 for a similar issue of mine against the runtime crate, on which everything that follows is based.


Currently the executor, reactor, and thread pools are all global and lazily started when they're first needed, and there's no way to e.g. start them earlier, stop them at some point, or run multiple separate ones.

This simplifies the implementation a lot at this point (extremely clean and easy-to-follow code right now!) and is also potentially more performant than passing around state via thread-local storage (as e.g. tokio does).

It does, however, limit usability in at least two scenarios where I'd like to make use of async-std.

Anyway, here are the reasons why this would be useful to have (I'm going to call the reactor/executor/threadpool combination a runtime in the following):

  1. Usage in library crates without interfering with any other futures code that other library crates or the application might use. This could also come with library-specific per-thread configuration, e.g. setting the runtime's thread priorities in a way that is meaningful for what this specific library is doing. (See also https://github.com/rustasync/runtime/issues/8)
  2. Similar to the above, but an extension with more requirements: plugins. A plugin might want to use a runtime internally, but at some point you may want to unload the plugin again. As Rust generally does static linking at this point, each plugin would include its own copy of async-std etc., so unloading a plugin also requires being able to shut down the runtime at a specific point and ensuring that none of the plugin's code is still running.
  3. Error isolation. While separate processes probably do this even better, being able to compartmentalize the application into parts that don't implicitly share any memory with each other could be useful, also for debuggability.
vertexclique commented 5 years ago

Truly, I am into this. I've mentioned this implicitly here:

https://github.com/async-rs/async-std/issues/14#issuecomment-521589882

... We should explicitly share a part of the thread pool for polling purposes with our own task-management mechanisms. If we don't do that, writing abstractions will become cumbersome over time.

ghost commented 5 years ago

This is a feature we should and will implement, but we'll need to explore the design space first. I think the biggest blocker right now is improving our scheduler. Once the scheduler matures, it should become more obvious how to start/stop runtimes manually instead of always relying on a single static one.

I really want to do something radically different from Tokio here. In particular, instead of splitting the runtime into fully independent components like threadpool, reactor, timer, and so on, which can all be individually started/stopped, I'd like to try a more tightly-coupled design where lines between these components become blurry.

sdroege commented 5 years ago

I really want to do something radically different from Tokio here. In particular, instead of splitting the runtime into fully independent components like threadpool, reactor, timer, and so on, which can all be individually started/stopped, I'd like to try a more tightly-coupled design where lines between these components become blurry.

If you use the Runtime API in tokio you get the whole bundle together. There is an API to run each of the components separately, but IMHO that doesn't really need to be exposed.

What ideas do you have for the scheduler and how would that play into the design here?

Basically, something like the API on the tokio Runtime type would already be sufficient; the tricky part is passing the runtime to everything that needs it.
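
For reference, this is roughly the shape of that tokio Runtime API. The sketch below is written against present-day tokio 1.x, which postdates this thread; the contemporary tokio version had different signatures, so treat it purely as an illustration of the shape, not as the API being proposed here.

use tokio::runtime::Builder;

fn main() {
    // Build an owned runtime instead of relying on an implicit global one.
    // Requires tokio's "rt-multi-thread" feature.
    let rt = Builder::new_multi_thread()
        .worker_threads(4)
        .enable_all()
        .build()
        .expect("failed to build runtime");

    // Drive a future to completion on this specific runtime; tasks spawned
    // inside the future land on `rt`'s threadpool.
    let answer = rt.block_on(async {
        let task = tokio::spawn(async { 1 + 1 });
        task.await.unwrap()
    });
    assert_eq!(answer, 2);

    // Dropping the runtime shuts its worker threads down.
    drop(rt);
}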

rw commented 5 years ago

Now that rustasync/runtime has been deprecated and archived, does this issue have increased importance?

My use case is that, when running integration tests, I need to be able to plug in my custom runtime in place of Tokio.

ghost commented 5 years ago

How does everyone feel about an API like this?

use async_std::rt::Runtime;
use async_std::task;

// Create a new runtime with default settings.
let rt = Runtime::new();

rt.block_on(async {
    // Since we're inside `rt` now, tasks get spawned onto `rt`.
    task::spawn(async { ... });
});

// What happens here?
drop(rt);

Something I'm unsure about is what should happen when we drop the runtime instance. If there are threads executing tasks at the moment drop(rt) happens, do we block until those tasks are completed? Do we just let them go and signal the threadpool that it should shut down ASAP? What kinds of shutdown procedures do you need?
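
One way to make that question concrete (purely illustrative; nothing like this exists in async-std and the names are invented) is an explicit shutdown policy rather than deciding the behaviour implicitly in Drop:

/// Hypothetical shutdown policies mirroring the two options described above.
pub enum ShutdownPolicy {
    /// Block until tasks currently being executed by worker threads finish.
    WaitForRunningTasks,
    /// Signal the threadpool to shut down as soon as possible and return.
    Immediate,
}

A method like rt.shutdown(policy) could then coexist with a Drop implementation that picks one of these as the default.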

yoshuawuyts commented 5 years ago

My use case is that, when running integration tests, I need to be able to plug in my custom runtime in place of Tokio.

@rw What purpose does this custom runtime serve? Are you mocking out any APIs perhaps?

sdroege commented 5 years ago

How does everyone feel about an API like this?

That would work as the most minimal starting point, yes. It would also be good to have some kind of rt.spawn(fut) that does not block, or otherwise a (cloneable) handle to something that can spawn onto this specific runtime.
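
As a sketch of the shape being asked for (none of these types exist in async-std; the bodies are deliberately left unimplemented):

use std::future::Future;

/// Hypothetical owned runtime, as in the proposal above.
pub struct Runtime;

/// Hypothetical cloneable handle bound to one specific runtime.
#[derive(Clone)]
pub struct Handle;

impl Runtime {
    /// Obtain a handle that can be cloned and passed around freely.
    pub fn handle(&self) -> Handle {
        Handle
    }

    /// Spawn onto this runtime without blocking the caller.
    pub fn spawn<F>(&self, _future: F)
    where
        F: Future<Output = ()> + Send + 'static,
    {
        unimplemented!("illustrative only")
    }
}

impl Handle {
    /// Same as `Runtime::spawn`, but callable from wherever the handle ends up.
    pub fn spawn<F>(&self, _future: F)
    where
        F: Future<Output = ()> + Send + 'static,
    {
        unimplemented!("illustrative only")
    }
}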

What kinds of shutdown procedures do you need?

Personally it would be sufficient to shut down ASAP on drop. The active time of the runtime would be defined by the given future, and it should be possible to get any other shutdown behaviour by writing that future in a specific way.

You probably do not want to wait for all spawned tasks, though: there might easily be interval timers (running forever) or timeouts (far in the future) scheduled that you don't really want to wait on.

sdroege commented 5 years ago

@stjepang Or maybe even have rt.block_on() consume the runtime by value, so that it's clear that after it returns nothing else is going to happen anymore.

vorner commented 5 years ago

I've found that being able to reuse the runtime for multiple iterative block_ons is a nice optimisation (I've used that in tokio). Actually, tokio even allows concurrent block_ons from multiple threads.

As for the drop… there should be some way to wait for the RT to drain itself before dropping it. And I think that should be the default, because oftentimes leaving a thread that actively does something while the main thread shuts down leads to weird behaviour and bugs. If one wants the runtime to outlive the main thread, it would be fine to forget(rt) instead.

ghost commented 5 years ago

because oftentimes leaving a thread that actively does something while the main thread shuts down leads to weird behaviour and bugs.

What if we leave only idling threads that have no tasks to execute when the main thread shuts down?

vorner commented 5 years ago

I don't know. I mean, if nothing runs it might be fine ‒ at least from the Rust user's point of view. But I've heard it might be problematic on Windows ‒ I don't know any details there, though.

If the thread is already synchronized to know it should stay idle, is it a problem to shut it down instead?

dignifiedquire commented 5 years ago

Adding this here as #137 points here. One blocker for adopting async-std in some places for me is that I have no control over the threadpools at all. The two minimum things I would need are a) setting a maximum thread count and b) setting properties on the threads, like their niceness level. The way Rayon solves this works quite well for me in this regard.

skade commented 5 years ago

@dignifiedquire To my reading, that describes a separate threadpool library with async/.await integration though?

dignifiedquire commented 5 years ago

@skade While that would be great, it's not what I meant. What I need is to control the number of threads, and their properties, for anything in my Rust stack that creates threads, including the executor/runtime.

I don't need access to threadpools directly, I just need to control them. For manual thread spawning I would still use a different solution.

This is hard enough today already, with a bunch of libraries creating threads here and there however they see fit, which is why I hope async-std can help stop that trend :)

ghost commented 5 years ago

@dignifiedquire Can you say more on how you want to set properties on threads and how Rayon solves the problem?

Is this the method you would be using in Rayon? https://docs.rs/rayon/1.2.0/rayon/struct.ThreadPoolBuilder.html#method.spawn_handler

You've also mentioned you need to control the number of threads. But I wonder what we should do in the case of spawn_blocking(), where the number of threads is dynamic and we essentially spawn an unlimited number of threads on demand?

dignifiedquire commented 5 years ago

This is roughly how I plan to use rayons capabilities: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=38ed5e90ea0fff7ca5cea1b196118bf3 (+ using spawn_handler to set properties)

My expectation would be that any thread spawning would be limited to a maximum that I set, and as such it would block until there is a free thread to use / there is room to spawn another one.
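
For reference, a rough sketch of the Rayon setup being described, combining a fixed thread count with spawn_handler (closely following the example in Rayon's documentation); the per-thread property setup, e.g. niceness, is only indicated by a comment since it is platform-specific:

use rayon::ThreadPoolBuilder;

fn main() -> Result<(), rayon::ThreadPoolBuildError> {
    let pool = ThreadPoolBuilder::new()
        // a) hard upper bound on the number of worker threads
        .num_threads(4)
        // b) take over thread creation to set per-thread properties
        .spawn_handler(|thread| {
            let mut builder = std::thread::Builder::new();
            if let Some(name) = thread.name() {
                builder = builder.name(name.to_owned());
            }
            if let Some(stack_size) = thread.stack_size() {
                builder = builder.stack_size(stack_size);
            }
            builder.spawn(|| {
                // Platform-specific setup (e.g. niceness/priority) would go here.
                thread.run()
            })?;
            Ok(())
        })
        .build()?;

    // Work submitted to this pool runs on the configured threads.
    pool.install(|| println!("running on the custom pool"));
    Ok(())
}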

vavrusa commented 4 years ago

I would be interested in this as well. The 0.3 futures have a decent Spawn mechanism for this. Perhaps async-std/task could implement something like:

pub fn spawn_on<F, T, S>(spawner: S, future: F) -> JoinHandle<T> where
    F: Future<Output = T> + Send + 'static,
    T: Send + 'static,
    S: Spawn

The caller would be responsible for providing an appropriate implementation (e.g. if the task is blocking, a spawner that can handle blocking tasks).
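
A minimal sketch of how such a spawn_on could be built on top of the futures 0.3 Spawn/SpawnExt traits, with a oneshot receiver standing in for async-std's JoinHandle; the function is hypothetical, and it takes the spawner by reference here purely for simplicity:

use futures::channel::oneshot;
use futures::executor::ThreadPool;
use futures::task::{Spawn, SpawnExt};
use std::future::Future;

/// Hypothetical: spawn `future` on a caller-provided spawner and hand back a
/// receiver that resolves with the task's output (a stand-in for JoinHandle).
pub fn spawn_on<F, T, S>(spawner: &S, future: F) -> oneshot::Receiver<T>
where
    F: Future<Output = T> + Send + 'static,
    T: Send + 'static,
    S: Spawn,
{
    let (tx, rx) = oneshot::channel();
    spawner
        .spawn(async move {
            // If the receiver was dropped, the result is simply discarded.
            let _ = tx.send(future.await);
        })
        .expect("spawner failed to spawn the task");
    rx
}

fn main() {
    // `ThreadPool` from the futures crate implements `Spawn`
    // (requires the crate's "thread-pool" feature).
    let pool = ThreadPool::new().expect("failed to build thread pool");
    let result = spawn_on(&pool, async { 40 + 2 });
    assert_eq!(futures::executor::block_on(result), Ok(42));
}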