Azure / amqpnetlite

AMQP 1.0 .NET Library
Apache License 2.0

Disconnected TCP sessions are not detected #510

johansme closed 2 years ago

johansme commented 2 years ago

We're having an issue where the TCP connections underlying our ReceiverLinks are closed by the remote host - i.e. the message broker. The connections being closed, however, is not the real problem. Our problem is that none of ReceiverLink, Session, or Connection detects that the connection has closed until about 2 minutes later. In the meantime, IsClosed returns false for all of them, and ReceiverLink.ReceiveAsync returns null.

The same issue occurs if the client loses network connection completely. If I pull out the Ethernet cable, the receiver will continue to operate without complaint for about two minutes - though, of course, unable to actually receive anything.

Another part of the issue is that, when the TCP connection has been closed remotely - i.e. we have received a TCP FIN packet - the resources are not cleaned up until those 2 minutes have expired. In other words, the connection is stuck in the CLOSE_WAIT state. This should have been discovered immediately, or at the latest at the first call to ReceiverLink.ReceiveAsync - and the corresponding Connection object should be closed (and the TCP connection finalized with a FIN/ACK).

We use the default constructors for all the objects, without specifying any Begin or OnAttached, and run ReceiverLink.ReceiveAsync with a 100 millisecond timeout in a while loop - checking IsClosed for each iteration.
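For reference, the setup described above can be sketched roughly as follows. This is a minimal reconstruction, not the reporter's actual code: the broker address, link name, and queue name are placeholders.

```csharp
using System;
using System.Threading.Tasks;
using Amqp;

class ReceiveLoopSketch
{
    static async Task Main()
    {
        // Placeholder address and names; substitute your own broker details.
        var connection = new Connection(new Address("amqp://guest:guest@localhost:5672"));
        var session = new Session(connection);
        var receiver = new ReceiverLink(session, "receiver-1", "queue1");

        // Poll with a 100 ms timeout, checking IsClosed on every iteration.
        while (!receiver.IsClosed && !session.IsClosed && !connection.IsClosed)
        {
            Message message = await receiver.ReceiveAsync(TimeSpan.FromMilliseconds(100));
            if (message == null)
            {
                // Timeout or, as reported here, a dead connection that has
                // not yet been detected. The two cases look identical.
                continue;
            }

            Console.WriteLine(message.Body);
            receiver.Accept(message);
        }

        Console.WriteLine("Link, session, or connection reported closed.");
    }
}
```

With this shape, a null result from ReceiveAsync cannot by itself distinguish "no message within 100 ms" from "the socket is dead but no I/O call has failed yet", which is why the loop keeps spinning during the 2-minute window.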

xinchen10 commented 2 years ago

Connection detects that a TCP connection is broken when an I/O call fails with a socket exception. Based on your description, the Socket implementation in the framework you are using did not fail the I/O call until 2 minutes later. This is something below the library and, if you are interested, you can research why it behaves like this on the platform your app runs on. The library does provide some mechanisms to help detect a broken connection sooner. For example, you could set smaller values for the TCP keep-alive settings and/or the connection idle timeout.
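A minimal sketch of that suggestion, assuming the connection is created through ConnectionFactory rather than the Connection constructor. The numeric values and the broker address are illustrative placeholders, not recommendations, and the units are assumed to be milliseconds; check them against the version of the library you use.

```csharp
using System;
using System.Threading.Tasks;
using Amqp;

class KeepAliveSketch
{
    static async Task Main()
    {
        var factory = new ConnectionFactory();

        // TCP keep-alive: start probing after 10 s of inactivity and
        // repeat every 2 s (assumed milliseconds; values are illustrative).
        factory.TCP.KeepAlive = new TcpKeepAliveSettings
        {
            KeepAliveTime = 10000,
            KeepAliveInterval = 2000
        };

        // AMQP idle timeout (assumed milliseconds; value is illustrative).
        factory.AMQP.IdleTimeout = 15000;

        // Placeholder address; substitute your own broker details.
        Connection connection = await factory.CreateAsync(
            new Address("amqp://guest:guest@localhost:5672"));

        // Log when the connection is reported closed; error may be null
        // on a clean close.
        connection.Closed += (sender, error) =>
            Console.WriteLine("Connection closed: " + (error?.Description ?? "no error"));

        var session = new Session(connection);
        var receiver = new ReceiverLink(session, "receiver-1", "queue1");
        // ... receive loop as before ...
    }
}
```

Keep-alive works at the TCP level and depends on operating system support, while the idle timeout works at the AMQP level. Setting both gives two independent ways for a dead peer to surface as a failed I/O call or a closed connection.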