agoragames / haigha

AMQP Python client
BSD 3-Clause "New" or "Revised" License
161 stars 41 forks source link

Fix channel frame handling upon framing error #74

Closed vitaly-krugl closed 8 years ago

vitaly-krugl commented 9 years ago

This PR, together with PR #73, should fully address issue #72

Drop non-close/close_ok methods per AMQP stability rule when closing due to input framing error.

Fixed up/improved a number of tests, inluding two with duplicate names.

Don't override system-level exceptions (e.g., SystemExit, KeyboardException) in Channel.process_frames()

vitaly-krugl commented 9 years ago

@awestendorf: this PR, together with PR #73 should fully address issue #72

vitaly-krugl commented 9 years ago

Per http://www.rabbitmq.com/resources/specs/amqp0-9-1.extended.xml, re. channel.close:

After sending this method, any received methods except Close and Close-OK MUST be discarded. ...

awestendorf commented 9 years ago

I see what you're doing here and I like it, but I want to be sure that this doesn't break anything, so it will take some testing and review before I'm comfortable merging.

vitaly-krugl commented 9 years ago

@awestendorf, thanks for reviewing. This one is far less urgent for me than #73.

This basically fixes the secondary issues that I discovered and described when investigating issue #72:

  1. The channel was getting confused by incoming framing error and was unable to recover (internal exceptions resulting in recursive Channel.close calls, and finally hang)
  2. It was overriding system-exiting exceptions, preventing things such as SystemExit and KeyboardException from working properly.

In this PR, I took the cautious approach of only implementing the stability rule when closing the channel under the above-described framing error duress conditions.

I limited the scope to this change to the framing error scenario out of concern for creating an unintended side-effect under normal Channel.close conditions.

For example, in the basic.return failure scenario, the basic.return frames were getting chewed up and causing internal exceptions and Channel.close recursion, and if we didn't start dropping non-close-related frames, then the user code listening for Basic.Ack, would have its "ack listener" callback called, and would have mistakenly concluded that the message was delivered successfully and marked the operation as successfully-completed, which would be a mistake.

vitaly-krugl commented 9 years ago

UPDATE: Hi Aaron, we've been running with this change in production error-free 24/7 for several months now. Prior to this PR, Haigha would hang when encountering an unexpected frame as discovered with missing Basic.Return support and documented in issue #72. It was also overriding system-level exceptions (e.g., SystemExit, KeyboardException) in Channel.process_frames().

vitaly-krugl commented 8 years ago

rebased against current master

vitaly-krugl commented 8 years ago

@awestendorf: Hi Aaron, I addressed your outstanding concern about the use of the "naked" constants for class and method identifiers via 9ec77191667b65b7e65483ce18950022c4e53469 in this PR and rebased against the current master. I believe that the named constants are defined in the right place in channel_class.ChannelClass and their use is consistent with the use of similar AMQP protocol constants in that class/module.

awestendorf commented 8 years ago

@vitaly-krugl I forgot about this PR. I'd like to find a way to abstract this, so that the ChannelClass ids don't bleed into the Channel abstraction. I may just have to get over it and consider an alternative relationship, like I did with Connection and ConnectionChannel.

Since I've sat on this so long and my only hangups are with the abstraction, I'm going to merge it any way. It's more important for it to be technically correct than to wait for a potential improvement to the code structure. I can always revisit that at a later date.