Closed kim closed 2 years ago
Just checking I understand this correctly. The hypothesis is that the deadlock in #772 is caused by the following sequence of events:
librad::net::quic::connection::tracking::Conntrack::disconnect
disconnect
calls librad::net::quic::connection::Connection::close
librad::net::quic::connection::tracking::Conntrack::disconnect
again, which deadlocks because there is already a lock held on the ConnTrack::connections
dashmap (by the first call to disconnect
in 1
)This change fixes things by changing 2
to just directly calls close on the underlying quic connection. Is that right?
Yep. This can only happen from the “GC” thread, when a connection is idle timed out. Which, if we got our must-run-network-future-to-completion business mostly right by now, should be quite rare.
That said, not ever calling close on the inner connection is a bug in itself.
This fixes a regression introduced in 760d3101 (net: store the actual connection in conntrack, 2021-07-29), which would cause Dashmap to deadlock.
Fixes #772 Signed-off-by: Kim Altintop kim@eagain.st