rust-lang / futures-rs

Zero-cost asynchronous programming in Rust
https://rust-lang.github.io/futures-rs/
Apache License 2.0
5.36k stars 620 forks source link

Best practices for combinators with their own task pools (a la FuturesUnordered) #2053

Open cramertj opened 4 years ago

cramertj commented 4 years ago

See https://github.com/rust-lang/futures-rs/issues/2047. When a combinator keeps its own task pool rather than deferring directly to the parent executor, cooperatively yielding (cx.waker.wake_by_ref(); return Poll::Pending) works "less well" because the task pool will remain active and thus be polled again. #2049 introduced a mitigation for this by limiting the number of times a FuturesUnordered is polled, but it'd be good to develop general guidance for these types of combinators, and perhaps a more principled solution for FuturesUnordered.

cc @jonhoo @carllerche

jonhoo commented 4 years ago

I think the most appealing solution to this is for FuturesUnordered to also cooperatively yield using the same mechanism that the underlying tasks are. #2049 is sort of a blunt solution since it just blindly yields every so often, and that has some sad implications (such as the multiplicative property @cramertj points out in #2047). If a mechanism like https://github.com/tokio-rs/tokio/pull/2160 was available throughout the ecosystem, then FuturesUnordered to simply use poll_cooperate (in the lingo of that PR) each time it detects that it has looped for example. I think the mechanism from that PR can be de-coupled from the executor by just providing hooks for the executor to bind into — see tokio::league::ceded — but it will certainly require some careful thought to be general enough.

cramertj commented 4 years ago

Yes, if there was some sort of global mechanism for setting budget and decrementing / resetting where appropriate, I could see FuturesUnordered making use of that.

jonhoo commented 4 years ago

One other option as briefly mentioned in https://github.com/rust-lang/futures-rs/issues/2047#issuecomment-577469813 is to make FuturesUnordered only ever iterate over its futures at most once. That is, if it ever detects that it is about to poll a given future for the second time in the same call to poll_next, then it returns Poll::Pending. How to do this efficiently given the current implementation is not entirely clear though.

udoprog commented 4 years ago

One other option as briefly mentioned in #2047 (comment) is to make FuturesUnordered only ever iterate over its futures at most once. That is, if it ever detects that it is about to poll a given future for the second time in the same call to poll_next, then it returns Poll::Pending. How to do this efficiently given the current implementation is not entirely clear though.

This is something I've been experimenting with for the last week in a project called unicycle. The architecture is described in more detail in the README, but briefly: It maintains two bitsets (active and alternate) and child tasks store wakeup interest in the active bitset. Once polled it swaps these bitsets and uses the alternate to determine what to poll while draining it. Once it is empty, it forcibly yields and starts over again.

This guarantees that each child task is only polled at most once per cycle. And child tasks which aggressively signal interest in waking up are not given any special preference.

What I've been able to determine so far:

With Jon's help I've been comparing it to FuturesUnordered in Noria's benchmarks. But all I can say so far is that it appears to provide the same throughput with a similar latency profile. I am however interested to know if anyone has any heavy, real world use of FuturesUnordered I can benchmark against. So please hit me up if you do!

dekellum commented 4 years ago

The benchmarks detailed here with links make heavy use of FuturesUnordered as a means to simulate concurrency and are throughput oriented. I suspect the latter body-image client benchmarks are potentially more interesting as they include net I/O.