quinn-rs / quinn

Async-friendly QUIC implementation in Rust
Apache License 2.0
3.76k stars 380 forks source link

bug: hang on read after connection close #1564

Closed stevenh closed 1 year ago

stevenh commented 1 year ago

There seems to be some sort of race in handling Connection::close with RecvStream::read.

We have a client which when told to quit closes the underlying connection of a RecvStream SendStream pair which are handled by separate thread. While the sending thread always correctly returns an error from write calls the read often hangs waiting for the return of the read which seems to block, triggering our shutdown timeout which is currently 5 seconds.

Good trace:

 INFO myapp: Client shutting down...
TRACE quinn_proto::connection: connection closed
 INFO myapp::output_audio: connection closed
TRACE drive:send: quinn_proto::connection: sending CONNECTION_CLOSE id=0 space=Data pn=67
TRACE drive:send: quinn_proto::connection: ACK ArrayRangeSet([36..37]), Delay = 0us id=0 space=Data pn=67
TRACE drive: quinn_proto::connection: sending 44 bytes in 1 datagrams id=0
 INFO myapp::input_audio: connection closed
 INFO myapp: Client shutdown

Hang trace:

 INFO myapp: Client shutting down...
TRACE quinn_proto::connection: connection closed
 INFO myapp::output_audio: connection closed
TRACE drive:send: quinn_proto::connection: sending CONNECTION_CLOSE id=0 space=Data pn=65
TRACE drive:send: quinn_proto::connection: ACK ArrayRangeSet([35..36]), Delay = 0us id=0 space=Data pn=65
TRACE drive: quinn_proto::connection: sending 44 bytes in 1 datagrams id=0
TRACE drive: quinn_proto::connection: timeout timer=Close id=0
TRACE mio::poll: deregistering event source from poller
 WARN myapp: Thread did not exit, aborting! name=Input timeout=5s
 INFO myapp: Client shutdown

The key difference is the last two lines which don't appear the successful case.

TRACE drive: quinn_proto::connection: timeout timer=Close id=0
TRACE mio::poll: deregistering event source from poller

I suspect what's happening underneath is the server is processing the CONNECTION_CLOSE and shutting down its writer resulting in the read never returning. In the success case I suspect the app is processing the data from a read so not actually in the call to block.

As documented by Connection::close it should unblock all operations and return immediately from the RecvStream::read

Pending operations will fail immediately with ConnectionError::LocallyClosed.

djc commented 1 year ago

Is this with 0.9 or 0.10 (released yesterday)? If 0.9, can you try with 0.10 (it might well still have the same issue)? If this is a 0.10 regression, that would be useful information too.

stevenh commented 1 year ago

It was with 0.9, trying with 0.10.1 now, will report back.

stevenh commented 1 year ago

Closing as invalid, there was internal issue bocking reads.

djc commented 1 year ago

Happy to hear that!