sdroege / async-tungstenite

Async binding for Tungstenite, the Lightweight stream-based WebSocket implementation
MIT License
396 stars 61 forks source link

Task wakes too often in release mode when used with `async_std` #133

Open timotheyca opened 5 months ago

timotheyca commented 5 months ago

This may cause a CPU core to be constantly at 100% after a client connects and sends nothing:

#[async_std::main]
async fn main() -> async_tungstenite::tungstenite::Result<()> {
    let listener = async_std::net::TcpListener::bind("127.0.0.1:8080").await?;
    let (stream, _) = listener.accept().await?;
    async_tungstenite::accept_async(stream).await?;
    Ok(())
}

(the same also happens after the handshake when the client is idle; using only accept_async as the example, because shorter)

It could be an issue with async-std but I haven't yet been able to replicate it without async-tungstenite (both codebases are still a bit hard for me to navigate).

[dependencies]
async-std = { version = "1.12.0", features = ["attributes"] }
async-tungstenite = { version = "0.26.1", features = ["async-std-runtime"] }
sdroege commented 5 months ago

If you do the same with tokio, does it behave correctly?

timotheyca commented 5 months ago

I've tested it with async-net (versions 1 and 2), and that worked okay; will test with tokio now;

ed.: tokio is fine

timotheyca commented 5 months ago

This, to me, looks a lot like some sort of soft race condition, because it goes away if I add some overhead (or run the thing in debug) or do anything that causes poll be called not as soon as it wakes, and happens more often if I add this at the very start of main:

async_std::task::spawn(futures_util::future::pending::<()>());
timotheyca commented 5 months ago

Update: this seems to affect anything that uses async_io::Async::<std::net::TcpStream>::poll_read directly. Happens both on versions 1 and 2

timotheyca commented 5 months ago

I found the reason (async_tungstenite::compat):

let waker = match kind {
    ContextWaker::Read => task::waker_ref(&self.read_waker_proxy),
    ContextWaker::Write => task::waker_ref(&self.write_waker_proxy),
};

and the "fix" (for now I can't figure out much better):

let waker = match kind {
    ContextWaker::Read => task::waker_ref(&self.read_waker_proxy),
    ContextWaker::Write => task::waker_ref(&self.write_waker_proxy),
}
.clone();

why:

as for why async-net works: I have no clue

timotheyca commented 5 months ago

Okay, so I've added a print:

let waker = match kind {
    ContextWaker::Read => task::waker_ref(&self.read_waker_proxy),
    ContextWaker::Write => task::waker_ref(&self.write_waker_proxy),
};
eprintln!("{waker:?} {:?}", waker.clone());

In debug builds (some parts omitted):

... data: 0x563a8f431fa0, vtable: 0x563a8e701368 ... data: 0x563a8f431fa0, vtable: 0x563a8e701368 ...

In release builds on 1.78/1.79:

... data: 0x563ac7b8cf30, vtable: 0x563ac5b80238 ... data: 0x563ac7b8cf30, vtable: 0x563ac5b80178 ...

???????

compiler bug?

Seems like it's making the same vtable be at different addresses.

ed.: it was indeed a compiler bug: https://github.com/rust-lang/futures-rs/pull/2829#issuecomment-1962863389

timotheyca commented 5 months ago

There's been a change in how Waker works. std::task::Wake has adapted to it and futures::task::ArcWake has not. I've checked with both std::task::Wake and with a patched version of futures::task::ArcWake, both work fine. So, ig, if there's a need to mitigate this without waiting for futures-task to be changed, switching over std::task::Wake is a possibility at the cost of some extra atomic operations (std::task doesn't provide an equivalent to waker_ref, as far as I see).

sdroege commented 5 months ago

I don't have time to look into this in more detail until some time next week, but this sounds like a bug in the code doing the comparison. If multiple codegen units are involved, it's not guaranteed that the vtable is always the same for the same object.

That's why e.g. https://doc.rust-lang.org/stable/std/sync/struct.Arc.html#method.ptr_eq exists.

timotheyca commented 5 months ago

I'll list all the parts of code I found relevant so far in one place, and with less guesses

async-tungstenite creating wakers that change their vtable on clone: https://github.com/sdroege/async-tungstenite/blob/6ff28b380c55b59a6ca96a6d76528754fa7b50ff/src/compat.rs#L128-L129

async-std directly uses async_io::Async: https://github.com/async-rs/async-std/blob/8947d8e03c4c267fdd3828e07368cefc7d39b002/src/net/tcp/stream.rs#L5 https://github.com/async-rs/async-std/blob/8947d8e03c4c267fdd3828e07368cefc7d39b002/src/net/tcp/stream.rs#L50 https://github.com/async-rs/async-std/blob/8947d8e03c4c267fdd3828e07368cefc7d39b002/src/net/tcp/stream.rs#L308

async-io checking Wakers for equality and waking the old one on change: https://github.com/smol-rs/async-io/blob/master/src/reactor.rs#L456-L461

futures-task creates vtables for original and cloned (note no #[inline(always)] on clone_arc_raw): https://github.com/rust-lang/futures-rs/blob/master/futures-task/src/waker_ref.rs#L64 https://github.com/rust-lang/futures-rs/blob/master/futures-task/src/waker.rs#L38-L42

Patch to alloc::task in Rust that uses #[inline(always)] to avoid the split across CGUs: https://github.com/rust-lang/rust/pull/121622

async-tungstenite can mitigate this by using std::task::Wake instead of futures_task::ArcWake or cloning the waker, both options involve extra Arc clones to work (&Arc<impl std::task::Wake> can't be directly converted into impl Deref<Target = Waker> like futures_task::waker_ref does).

futures-task can mitigate this by putting #[inline(always)] on clone_arc_raw.

sdroege commented 5 months ago

futures-task can mitigate this by putting #[inline(always)] on clone_arc_raw.

That seems like a solution that will just fail again some time in the future. This sounds to me like will_wake() should actually only check the pointer for equality and not the vtable.

async-tungstenite can mitigate this by using std::task::Wake instead of futures_task::ArcWake or cloning the waker

Doing this as a workaround for now seems acceptable. Do you want to provide a PR for that?

sdroege commented 5 months ago

Intentionally keeping this open here so there's a reminder to revert the change once that's safe to do again.

timotheyca commented 2 months ago

https://github.com/rust-lang/futures-rs/commit/b92f4c55a3f8e9ffd2201b7a2c1827343339c92e

timotheyca commented 1 month ago

Tested with futures 0.3.31 and async-tungstenite =0.26.1, the problem does not seem to appear, unlike with futures =0.3.30 and async-tungstenite =0.26.1.

sdroege commented 1 month ago

Thanks, I'll update the minimum version for futures with the next release and drop the workaround