When an amqplib 1.0.2 connection is asked to close:
1. It sends message (10, 60).
2. It receives message (10, 61).
3. This invokes ._close_ok().
4. Which invokes ._do_close().
5. Which calls .transport.close().
6. Which attempts self.sock.shutdown(...).
This creates an exciting race condition. Which will happen first? Will the
socket .shutdown() method be successfully called on a still-open socket? Or
will the FIN packet from the server (RabbitMQ, in my case) arrive fast enough
to mark the socket as closed before Python gets around to invoking the socket's
shutdown() system call?
On Linux and Windows, this is an entirely safe race condition — both
operating systems are very forgiving about calling shutdown() on a closed
socket, so the above sequence always works.
But on Mac OS X — a BSD variant — the above code only succeeds if the
operating system thinks that the socket is still open when shutdown() is
invoked. If Python is too slow and the FIN packet arrives before that statement
can be reached, then OS X kills the self.sock.shutdown() statement with:
socket.error: [Errno 57] Socket is not connected
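The race can be approximated outside amqplib with a plain loopback connection. This is only a rough sketch: the helper name shutdown_after_peer_close and the delay parameter are made up for illustration, and a local listening socket stands in for RabbitMQ. The "server" side closes first, the client waits long enough for the FIN to arrive, and only then calls shutdown(). Going by the behavior described above, the call should succeed on Linux and fail with ENOTCONN on OS X, so the function reports whichever outcome it sees instead of assuming one.

```python
import errno
import socket
import time


def shutdown_after_peer_close(delay=0.2):
    """Return None if shutdown() succeeds, else the errno it raised.

    Hypothetical helper: a loopback listener plays the role of the server.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)

    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()

    # The "server" closes first, so its FIN is sent to the client ...
    server_side.close()
    listener.close()
    # ... and the client deliberately dawdles before calling shutdown().
    time.sleep(delay)

    try:
        client.shutdown(socket.SHUT_RDWR)
        return None
    except socket.error as exc:
        return exc.errno
    finally:
        client.close()


if __name__ == "__main__":
    result = shutdown_after_peer_close()
    if result is None:
        print("shutdown() succeeded")
    else:
        print("shutdown() failed: %s" % errno.errorcode.get(result, result))
```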
There seems to be no way to check whether a socket is closed, nor, I
guess, would it help, since the socket could close between such a check and the
actual shutdown() call! So there are two possibilities here:
1. Protect shutdown() with a try…except that catches the socket.error, checks
that its errno is ENOTCONN (57 on OS X), and ignores the error only in that case.
2. Ditch the shutdown() altogether. Don't all modern OSes perform a shutdown()
on a closed socket anyway? I am suspicious of the claim in the transport.py
comment that outgoing data could be lost; I would want to see evidence that any
modern TCP/IP stack behaves in that way. I cannot find a recommendation to use
shutdown() to avoid data loss on Stack Overflow, for example.
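The first possibility could look roughly like the sketch below. It is not the actual amqplib code: shutdown_quietly is a hypothetical helper name, and the real transport would call it on self.sock. Only ENOTCONN is swallowed; any other socket error is re-raised so that genuine failures stay visible.

```python
import errno
import socket


def shutdown_quietly(sock):
    """Shut down both directions of sock, tolerating an already-closed peer.

    Returns True if shutdown() ran normally and False if the OS reported
    that the socket was no longer connected (ENOTCONN, errno 57 on OS X).
    Hypothetical helper, not part of amqplib.
    """
    try:
        sock.shutdown(socket.SHUT_RDWR)
    except socket.error as exc:
        if exc.errno != errno.ENOTCONN:
            raise  # some other failure: do not hide it
        return False
    return True
```

Comparing against errno.ENOTCONN instead of the literal 57 keeps the check portable, since the numeric value of that constant differs between operating systems.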
Thanks so much for amqplib — I am using it with greenlet/eventlet and the
project is going very well!
Original issue reported on code.google.com by brandon....@gmail.com on 11 Nov 2011 at 9:11