agoragames / haigha

AMQP Python client
BSD 3-Clause "New" or "Revised" License
160 stars 41 forks source link

Increase read timeout to twice the heartbeat interval #78

Closed liorn closed 9 years ago

liorn commented 9 years ago

Haigha should wait two heartbeat intervals before giving up on the read request, according to the AMQP 0.9.1 spec. See "4.2.7 - Heartbeat frames":

If a peer detects no incoming traffic (i.e. received octets) for two heartbeat intervals or longer, it should close the connection without following the Connection.Close/Close-Ok handshaking, and log an error.

@awestendorf, I think that in the future the entire behavior should be changed when this timeout expires. According to AMQP, when nothing is read for two heartbeat intervals, the connection should be closed - today, haigha ignores it (returns None from SocketTransport when it expires). This causes dead connections to keep on living on client side unnecessarily.

Tell me if you think this change should not be done. I'm planning additional pull requests to address this soon.

Thanks!

liorn commented 9 years ago

@vitaly-krugl @awestendorf Take a look at the new commit. I've improved my first one - now the client sends a heartbeat every _heartbeat seconds, but correctly waits for 2 heartbeat intervals before closing the connection.

awestendorf commented 9 years ago

Thank you for the change, this definitely should be added.

I think that rather than relying on the socket to govern the logic, you should track time directly. There are cases where a socket read could be interrupted without actually passing bytes back to the application. These are few and far between, but in the interests of getting the logic correct, I recommend that be taken into consideration.

Something like this perhaps, with _last_octet_time initialized when the socket connection is established (before the handshake):

data = self._transport.read(self._heartbeat)
c_time = time.time()

if data is None:
    if self._heartbeat and (c_time-(self._last_octet_time or c_time)>2*self._heartbeat):
        self.close(reply_code=501,
                   reply_text='Heartbeats not received for %d seconds' % 2*self._heartbeat,
               class_id=0, method_id=0, disconnect=False)
        raise ConnectionClosed("connection is closed: %s : %s" %
                         (self._close_info['reply_code'],
                          self._close_info['reply_text']))

    return

self._last_octet_time = c_time
liorn commented 9 years ago

@awestendorf Done. Take a look.

awestendorf commented 9 years ago

@liorn looks good, I just had a couple of suggestions

liorn commented 9 years ago

Pushed. Thanks for the review @awestendorf.

liorn commented 9 years ago

Any timeline for next haigha version. @awestendorf ?

awestendorf commented 9 years ago

I merged and when Travis passes I'll release this with #73.

I forgot to ask to flatten the commits, but for future reference, that would be helpful. Thanks for implementing this!

awestendorf commented 9 years ago

Before release I went ahead and squashed those commits into a5173df86f23e205a7ab8504d03b532260e376d6