Open sdroege opened 5 years ago
Truly, I am into this. I've mentioned this implicitly here:
https://github.com/async-rs/async-std/issues/14#issuecomment-521589882
... We should explicitly share a part of the thread pool, for polling purposes, with our own task mechanisms' management. If we don't do that, writing abstractions will become cumbersome over time.
This is a feature we should and will implement, but we'll need to explore the design space first. I think the biggest blocker right now is improving our scheduler. Once the scheduler matures, it should become more obvious how to start/stop runtimes manually instead of always relying on a single static one.
I really want to do something radically different from Tokio here. In particular, instead of splitting the runtime into fully independent components like threadpool, reactor, timer, and so on, which can all be individually started/stopped, I'd like to try a more tightly-coupled design where lines between these components become blurry.
If you use the `Runtime` API in tokio then you get the whole bundle all together. There's an API to run each of the components separately, but IMHO that doesn't really need to be exposed.
What ideas do you have for the scheduler and how would that play into the design here?
Basically, something like the API on the tokio `Runtime` type would be sufficient already. The tricky part is passing the runtime to everything that needs it.
Now that rustasync/runtime has been deprecated and archived, does this issue have increased importance?
My use case is that, when running integration tests, I need to be able to plug in my custom runtime in place of Tokio.
How does everyone feel about an API like this?
```rust
use async_std::rt::Runtime;
use async_std::task;

// Create a new runtime with default settings.
let rt = Runtime::new();

rt.block_on(async {
    // Since we're inside `rt` now, tasks get spawned onto `rt`.
    task::spawn(async { /* ... */ });
});

// What happens here?
drop(rt);
```
Something I'm unsure about is what should happen when we drop the runtime instance. If there are threads executing tasks at the moment `drop(rt)` happens, do we block until those tasks are completed? Do we just let them go and signal the threadpool that it should shut down ASAP? What kinds of shutdown procedures do you need?
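To make the two shutdown options above concrete, here is a minimal sketch using only the standard library. `ToyRuntime`, `OnDrop`, and `completed_after_drop` are hypothetical names invented for illustration, and the pool runs plain closures rather than futures; it is not how async-std implements anything.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::mpsc::{channel, Sender};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

type Job = Box<dyn FnOnce() + Send + 'static>;

/// What `Drop` should do with work that is still queued (hypothetical).
#[derive(Clone, Copy, PartialEq)]
pub enum OnDrop {
    /// Run everything that was queued, then join the workers.
    Drain,
    /// Skip queued work; only jobs already running are waited for.
    Asap,
}

pub struct ToyRuntime {
    sender: Option<Sender<Job>>,
    workers: Vec<JoinHandle<()>>,
    stop: Arc<AtomicBool>,
    policy: OnDrop,
}

impl ToyRuntime {
    pub fn new(threads: usize, policy: OnDrop) -> Self {
        let (sender, receiver) = channel::<Job>();
        let receiver = Arc::new(Mutex::new(receiver));
        let stop = Arc::new(AtomicBool::new(false));
        let workers = (0..threads)
            .map(|_| {
                let receiver = Arc::clone(&receiver);
                let stop = Arc::clone(&stop);
                thread::spawn(move || loop {
                    // The lock is held only while receiving, not while running the job.
                    let job = receiver.lock().unwrap().recv();
                    match job {
                        Ok(job) if !stop.load(Ordering::SeqCst) => job(),
                        Ok(_skipped) => {} // shutting down: discard queued work
                        Err(_) => break,   // queue closed and empty
                    }
                })
            })
            .collect();
        ToyRuntime { sender: Some(sender), workers, stop, policy }
    }

    pub fn spawn(&self, job: impl FnOnce() + Send + 'static) {
        self.sender.as_ref().unwrap().send(Box::new(job)).unwrap();
    }
}

impl Drop for ToyRuntime {
    fn drop(&mut self) {
        if self.policy == OnDrop::Asap {
            self.stop.store(true, Ordering::SeqCst);
        }
        drop(self.sender.take()); // close the queue so workers can exit
        for worker in self.workers.drain(..) {
            let _ = worker.join();
        }
    }
}

/// Queue `jobs` counters on a fresh runtime, drop it, and report how many ran.
pub fn completed_after_drop(policy: OnDrop, jobs: usize) -> usize {
    let done = Arc::new(AtomicUsize::new(0));
    let rt = ToyRuntime::new(2, policy);
    for _ in 0..jobs {
        let done = Arc::clone(&done);
        rt.spawn(move || {
            done.fetch_add(1, Ordering::SeqCst);
        });
    }
    drop(rt);
    done.load(Ordering::SeqCst)
}
```

With `OnDrop::Drain` every queued job runs before `drop` returns. With `OnDrop::Asap` the number that ran depends on timing. Neither variant leaves a thread running after `drop`, which is a third option the question above also allows for.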
My use case is that, when running integration tests, I need to be able to plug in my custom runtime in place of Tokio.
@rw What purpose does this custom runtime serve? Are you mocking out any APIs perhaps?
How does everyone feel about an API like this?
That would work as the most minimal starting point, yes. Maybe some kind of `rt.spawn(fut)` that does not block would also be good to have, or otherwise a way to get a (cloneable) handle to something that can spawn on this specific runtime.
What kinds of shutdown procedures do you need?
Personally, it would be sufficient to shut down ASAP on `drop`. The active time of the runtime would be defined by the given future, and it should be possible to get any other shutdown behaviour by writing that future in a specific way.
You probably do not want to wait for all spawned tasks though: there might easily be interval timers (which run forever) or timeouts (far in the future) scheduled that you don't really want to wait on.
@stjepang Or maybe even have `rt.block_on()` consume the runtime by value, so that it's clear that nothing else is going to happen after it has returned.
I've found that being able to reuse the runtime for multiple successive `block_on` calls is a nice optimisation (I've used that in tokio). Actually, tokio even allows concurrent `block_on` calls from multiple threads.
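The difference between a consuming and a reusable `block_on` can be sketched with the standard library alone. Everything here is an assumption made for illustration: `ToyRuntime` is an empty stand-in with no threads or reactor, `run` is a made-up name for the by-value variant, and the waker simply unparks the calling thread.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Wakes the thread that is blocked inside `block_on`.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Hypothetical stand-in; a real runtime would own worker threads and a reactor.
pub struct ToyRuntime;

impl ToyRuntime {
    pub fn new() -> Self {
        ToyRuntime
    }

    /// Borrowing variant: the runtime can be reused for several calls.
    pub fn block_on<F: Future>(&self, future: F) -> F::Output {
        let mut future = pin!(future);
        let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
        let mut cx = Context::from_waker(&waker);
        loop {
            match future.as_mut().poll(&mut cx) {
                Poll::Ready(output) => return output,
                // Spurious wake-ups are harmless: we just poll again.
                Poll::Pending => thread::park(),
            }
        }
    }

    /// Consuming variant: the runtime is dropped once the future completes.
    pub fn run<F: Future>(self, future: F) -> F::Output {
        self.block_on(future)
        // `self` goes out of scope here, so shutdown has a well-defined point.
    }
}

/// Uses the borrowing variant twice, then the consuming one.
pub fn demo() -> (i32, i32, i32) {
    let rt = ToyRuntime::new();
    let a = rt.block_on(async { 1 });
    let b = rt.block_on(async { 2 });
    let c = rt.run(async { 3 });
    // rt.block_on(async { 4 }); // would not compile: `rt` was moved into `run`
    (a, b, c)
}
```

Taking `&self` keeps the reuse described above possible, while a `self` method lets the compiler reject any use of the runtime after its final future has finished. Offering both is one way to get each property where it is wanted.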
As for the drop… there should be some way to wait for the RT to drain itself before dropping it. And I think that should be the default, because leaving a thread that is actively doing something while the main thread shuts down often leads to weird behaviour and bugs. If one wants it to outlive main, it would be fine to `forget(rt)` instead.
because oftentime leaving a thread that actively does something while the main thread shuts down does weird things and bugs.
What if we leave only idling threads that have no tasks to execute when the main thread shuts down?
I don't know. I mean, if nothing runs it might be fine ‒ at least from the Rust user point of view. But I've heard it might be problematic on Windows ‒ I don't know any details there, though.
If the thread is already synchronized to know it should stay idle, is it a problem to shut it down instead?
Adding this here as #137 points here. One blocker for adopting async-std in some places for me is that I have no control over the threadpools at all. The two minimum things I would need are a) setting a maximum thread count and b) setting properties on the threads, like their niceness level. The way rayon solves this works quite well for me in this regard.
@dignifiedquire To my reading, that describes a separate threadpool library with async/.await integration though?
@skade while that would be great, that was not what I meant. What I need is to be able to control the number of threads, and their properties, that anything in my Rust stack creates, including the executor/runtime.
I don't need access to threadpools directly, I just need to control them. For manual thread spawning I would still use a different solution.
This is hard enough today, with a bunch of libraries already creating threads here and there however they see fit, which is why I hope async-std can help in stopping that trend :)
@dignifiedquire Can you say more on how you want to set properties on threads and how Rayon solves the problem?
Is this the method you would be using in Rayon? https://docs.rs/rayon/1.2.0/rayon/struct.ThreadPoolBuilder.html#method.spawn_handler
You've also mentioned you need to control the number of threads. But I wonder what we should do in the case of `spawn_blocking()`, where the number of threads is dynamic and we essentially spawn an unlimited number of threads on demand?
This is roughly how I plan to use rayon's capabilities: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=38ed5e90ea0fff7ca5cea1b196118bf3 (+ using `spawn_handler` to set properties)
My expectation would be that any thread spawning would be limited to a max that I set, and as such it would block until there is a free thread to use or there is room to spawn another one.
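The expectation above (a hard cap on threads, callers blocking until a slot is free, and configurable thread properties) can be sketched with `std::thread::Builder` and a condition variable. `BoundedSpawner` and `peak_concurrency` are hypothetical names, not async-std or rayon API. The standard library has no portable way to set niceness, so only name and stack size are shown; a real version would need a platform-specific call inside the worker.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Condvar, Mutex};
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical spawner that never runs more than `max_threads` workers at once.
pub struct BoundedSpawner {
    max_threads: usize,
    name_prefix: String,
    stack_size: usize,
    active: Arc<(Mutex<usize>, Condvar)>,
}

/// Releases one slot when the worker finishes, even if its job panics.
struct Slot(Arc<(Mutex<usize>, Condvar)>);

impl Drop for Slot {
    fn drop(&mut self) {
        let (count, freed) = &*self.0;
        *count.lock().unwrap() -= 1;
        freed.notify_one();
    }
}

impl BoundedSpawner {
    pub fn new(max_threads: usize, name_prefix: &str, stack_size: usize) -> Self {
        assert!(max_threads > 0, "need at least one thread");
        BoundedSpawner {
            max_threads,
            name_prefix: name_prefix.to_owned(),
            stack_size,
            active: Arc::new((Mutex::new(0), Condvar::new())),
        }
    }

    /// Blocks the caller until fewer than `max_threads` workers are running.
    pub fn spawn_blocking<T, F>(&self, job: F) -> JoinHandle<T>
    where
        F: FnOnce() -> T + Send + 'static,
        T: Send + 'static,
    {
        let (count, freed) = &*self.active;
        let mut running = count.lock().unwrap();
        while *running >= self.max_threads {
            running = freed.wait(running).unwrap();
        }
        *running += 1;
        let id = *running;
        drop(running);

        let slot = Slot(Arc::clone(&self.active));
        thread::Builder::new()
            .name(format!("{}-{}", self.name_prefix, id))
            .stack_size(self.stack_size)
            .spawn(move || {
                let _slot = slot; // freed when the closure returns or unwinds
                job()
            })
            .expect("failed to spawn worker thread")
    }
}

/// Runs `jobs` short sleeps and reports the highest number that overlapped.
pub fn peak_concurrency(max_threads: usize, jobs: usize) -> usize {
    let spawner = BoundedSpawner::new(max_threads, "toy-blocking", 256 * 1024);
    let current = Arc::new(AtomicUsize::new(0));
    let peak = Arc::new(AtomicUsize::new(0));
    let handles: Vec<_> = (0..jobs)
        .map(|_| {
            let (current, peak) = (Arc::clone(&current), Arc::clone(&peak));
            spawner.spawn_blocking(move || {
                let now = current.fetch_add(1, Ordering::SeqCst) + 1;
                peak.fetch_max(now, Ordering::SeqCst);
                thread::sleep(Duration::from_millis(5));
                current.fetch_sub(1, Ordering::SeqCst);
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    peak.load(Ordering::SeqCst)
}
```

Blocking the caller is the simplest way to honour the cap, but inside an async task it would stall an executor thread. Queueing the job until a slot is free is the obvious alternative and is not shown here.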
I would be interested in this as well. The 0.3 futures have a decent Spawn mechanism for this. Perhaps async-std/task could implement something like:
```rust
pub fn spawn_on<F, T, S>(spawner: S, future: F) -> JoinHandle<T>
where
    F: Future<Output = T> + Send + 'static,
    T: Send + 'static,
    S: Spawn
```
The caller would be responsible for providing an appropriate implementation (e.g. if the task is blocking, a spawner that can handle blocking tasks).
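As a rough illustration of how such a `spawn_on` could be wired up, here is a standard-library-only sketch. The `Spawn` trait below is a simplified, hypothetical stand-in and not the trait from futures 0.3. `JoinHandle` is a made-up blocking handle backed by a channel, and `ThreadPerTask` is a toy spawner that drives each task on its own OS thread.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::mpsc::{channel, Receiver};
use std::sync::Arc;
use std::task::{Context, Wake, Waker};
use std::thread::{self, Thread};

type BoxedTask = Pin<Box<dyn Future<Output = ()> + Send + 'static>>;

/// Hypothetical, simplified spawner trait (not the futures 0.3 `Spawn`).
pub trait Spawn {
    fn spawn_boxed(&self, task: BoxedTask);
}

/// Hypothetical handle that receives the task's output over a channel.
pub struct JoinHandle<T>(Receiver<T>);

impl<T> JoinHandle<T> {
    /// Blocks until the task finishes; `None` if it was dropped or panicked.
    pub fn join(self) -> Option<T> {
        self.0.recv().ok()
    }
}

pub fn spawn_on<F, T, S>(spawner: S, future: F) -> JoinHandle<T>
where
    F: Future<Output = T> + Send + 'static,
    T: Send + 'static,
    S: Spawn,
{
    let (tx, rx) = channel();
    // Wrap the future so the spawner only ever sees `Output = ()`.
    spawner.spawn_boxed(Box::pin(async move {
        let output = future.await;
        let _ = tx.send(output); // the handle may already have been dropped
    }));
    JoinHandle(rx)
}

/// Wakes the thread that is driving a task.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Toy spawner: one OS thread per task, each polling its task to completion.
pub struct ThreadPerTask;

impl Spawn for ThreadPerTask {
    fn spawn_boxed(&self, mut task: BoxedTask) {
        thread::spawn(move || {
            let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
            let mut cx = Context::from_waker(&waker);
            while task.as_mut().poll(&mut cx).is_pending() {
                thread::park();
            }
        });
    }
}
```

Under these assumptions, `spawn_on(ThreadPerTask, async { 21 * 2 }).join()` yields `Some(42)`. Because the spawner is a parameter, a caller could pass one that is suitable for blocking work, or one bound to a particular runtime instance.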
While this seems like something that shouldn't be considered for 1.0 IMHO, it would be good to start discussing what an API could look like and what requirements different folks have here.
Also, while this is kind of related to https://github.com/async-rs/async-std/issues/60, my main point here is about being able to control the lifetime of the reactor/executor, to run multiple of them, and to choose which one is used when and where. See also https://github.com/rustasync/runtime/issues/42 for a similar issue of mine for the `runtime` crate, on which everything that follows is based.
Currently the executor, reactor, and thread pools are all global and lazily started when they're first needed, and there's no way to e.g. start them earlier, stop them at some point, run multiple separate ones, etc.
This simplifies the implementation a lot at this point (extremely clean and easy-to-follow code right now!) and is also potentially more performant than passing around state via thread-local storage (like in e.g. `tokio`).
It however limits the usability in at least two scenarios where I'd like to make use of `async-std`.
Anyway, reasons why this would be useful to have (I'm going to call the reactor/executor/threadpool combination a runtime for the following):
`async-std`/etc included, so unloading a plugin also requires being able to shut down the runtime at a specific point and to ensure that none of the plugin's code is running anymore.