Open timotheyca opened 5 months ago
If you do the same with tokio, does it behave correctly?
I've tested it with async-net
(versions 1
and 2
), and that worked okay;
will test with tokio
now;
ed.: tokio
is fine
This, to me, looks a lot like some sort of soft race condition, because it goes away if I add some overhead (or run the thing in debug
) or do anything that causes poll be called not as soon as it wakes, and happens more often if I add this at the very start of main
:
async_std::task::spawn(futures_util::future::pending::<()>());
Update: this seems to affect anything that uses async_io::Async::<std::net::TcpStream>::poll_read
directly. Happens both on versions 1
and 2
I found the reason (async_tungstenite::compat
):
let waker = match kind {
ContextWaker::Read => task::waker_ref(&self.read_waker_proxy),
ContextWaker::Write => task::waker_ref(&self.write_waker_proxy),
};
and the "fix" (for now I can't figure out much better):
let waker = match kind {
ContextWaker::Read => task::waker_ref(&self.read_waker_proxy),
ContextWaker::Write => task::waker_ref(&self.write_waker_proxy),
}
.clone();
why:
Waker
returned by task::waker_ref
changes its vtable on cloneasync-io
's Source
, just like AtomicWaker
, checks Waker
equality by Waker::will_wake
which compares both the pointer and the vtableasync-io
's Source
wakes the older Waker
on changeas for why async-net
works: I have no clue
Okay, so I've added a print:
let waker = match kind {
ContextWaker::Read => task::waker_ref(&self.read_waker_proxy),
ContextWaker::Write => task::waker_ref(&self.write_waker_proxy),
};
eprintln!("{waker:?} {:?}", waker.clone());
In debug builds (some parts omitted):
... data: 0x563a8f431fa0, vtable: 0x563a8e701368 ... data: 0x563a8f431fa0, vtable: 0x563a8e701368 ...
In release builds on 1.78/1.79:
... data: 0x563ac7b8cf30, vtable: 0x563ac5b80238 ... data: 0x563ac7b8cf30, vtable: 0x563ac5b80178 ...
???????
compiler bug?
Seems like it's making the same vtable be at different addresses.
ed.: it was indeed a compiler bug: https://github.com/rust-lang/futures-rs/pull/2829#issuecomment-1962863389
There's been a change in how Waker
works. std::task::Wake
has adapted to it and futures::task::ArcWake
has not. I've checked with both std::task::Wake
and with a patched version of futures::task::ArcWake
, both work fine. So, ig, if there's a need to mitigate this without waiting for futures-task
to be changed, switching over std::task::Wake
is a possibility at the cost of some extra atomic operations (std::task
doesn't provide an equivalent to waker_ref
, as far as I see).
I don't have time to look into this in more detail until some time next week, but this sounds like a bug in the code doing the comparison. If multiple codegen units are involved, it's not guaranteed that the vtable is always the same for the same object.
That's why e.g. https://doc.rust-lang.org/stable/std/sync/struct.Arc.html#method.ptr_eq exists.
I'll list all the parts of code I found relevant so far in one place, and with less guesses
async-tungstenite
creating wakers that change their vtable on clone:
https://github.com/sdroege/async-tungstenite/blob/6ff28b380c55b59a6ca96a6d76528754fa7b50ff/src/compat.rs#L128-L129
async-std
directly uses async_io::Async
:
https://github.com/async-rs/async-std/blob/8947d8e03c4c267fdd3828e07368cefc7d39b002/src/net/tcp/stream.rs#L5
https://github.com/async-rs/async-std/blob/8947d8e03c4c267fdd3828e07368cefc7d39b002/src/net/tcp/stream.rs#L50
https://github.com/async-rs/async-std/blob/8947d8e03c4c267fdd3828e07368cefc7d39b002/src/net/tcp/stream.rs#L308
async-io
checking Waker
s for equality and waking the old one on change:
https://github.com/smol-rs/async-io/blob/master/src/reactor.rs#L456-L461
futures-task
creates vtables for original and cloned (note no #[inline(always)]
on clone_arc_raw
):
https://github.com/rust-lang/futures-rs/blob/master/futures-task/src/waker_ref.rs#L64
https://github.com/rust-lang/futures-rs/blob/master/futures-task/src/waker.rs#L38-L42
Patch to alloc::task
in Rust that uses #[inline(always)]
to avoid the split across CGUs:
https://github.com/rust-lang/rust/pull/121622
async-tungstenite
can mitigate this by using std::task::Wake
instead of futures_task::ArcWake
or cloning the waker
, both options involve extra Arc
clones to work (&Arc<impl std::task::Wake>
can't be directly converted into impl Deref<Target = Waker>
like futures_task::waker_ref
does).
futures-task
can mitigate this by putting #[inline(always)]
on clone_arc_raw
.
futures-task
can mitigate this by putting#[inline(always)]
onclone_arc_raw
.
That seems like a solution that will just fail again some time in the future. This sounds to me like will_wake()
should actually only check the pointer for equality and not the vtable.
async-tungstenite
can mitigate this by usingstd::task::Wake
instead offutures_task::ArcWake
or cloning thewaker
Doing this as a workaround for now seems acceptable. Do you want to provide a PR for that?
Intentionally keeping this open here so there's a reminder to revert the change once that's safe to do again.
Tested with futures 0.3.31
and async-tungstenite =0.26.1
, the problem does not seem to appear, unlike with futures =0.3.30
and async-tungstenite =0.26.1
.
Thanks, I'll update the minimum version for futures with the next release and drop the workaround
This may cause a CPU core to be constantly at
100%
after a client connects and sends nothing:(the same also happens after the handshake when the client is idle; using only
accept_async
as the example, because shorter)It could be an issue with
async-std
but I haven't yet been able to replicate it withoutasync-tungstenite
(both codebases are still a bit hard for me to navigate).