Closed comzyh closed 7 years ago
I got this error while trying to debug a connection issue by restarting rabbitmq. On the producer side, I was getting channel closed errors, but on the consumer side, I received large amounts of these.
Arf, py35. Could be why we've never seen this before. Any chance either of you could try your use case with 3.4 to confirm the impact?
Cheers
Python 3.5's StreamReaderProtocol sets self._stream_reader to None in connection_lost:
    def connection_lost(self, exc):
        if self._stream_reader is not None:
            if exc is None:
                self._stream_reader.feed_eof()
            else:
                self._stream_reader.set_exception(exc)
        super().connection_lost(exc)
        self._stream_reader = None
        self._stream_writer = None
The code needs to check if self._stream_reader is None (or infer that the connection has been closed some other way). I think this issue will be triggered by the task in run()/dispatch_frame() waiting for data to be read from the socket, and then the connection dies.
For example, if the self.worker task (which is where the read_frame exception happens) were cancelled on connection close, this would be avoided. Cancelling it is also something that should be done to avoid warnings from asyncio, because right now the task is left running indefinitely (it does die on its own from that same exception when the transport is closed, but the task is never cleaned up).
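A minimal sketch of the cancel-on-close idea, under stated assumptions: WorkerOwningProtocol, start_worker() and read_forever() are hypothetical names made up for illustration and are not the library's actual code. Only the pattern of cancelling the task from connection_lost() is the point.

```python
import asyncio


class WorkerOwningProtocol(asyncio.Protocol):
    """Hypothetical protocol that owns a background reader task."""

    def __init__(self):
        self.worker = None  # set once the reader task has been started

    def start_worker(self, coro):
        # Schedule the reader coroutine as a task on the running loop.
        self.worker = asyncio.ensure_future(coro)

    def connection_lost(self, exc):
        # Cancel the reader task so it does not outlive the connection.
        if self.worker is not None and not self.worker.done():
            self.worker.cancel()
        super().connection_lost(exc)


async def demo():
    protocol = WorkerOwningProtocol()

    async def read_forever():
        # Stand-in for a loop that is blocked waiting on the socket.
        await asyncio.Event().wait()

    protocol.start_worker(read_forever())
    await asyncio.sleep(0)          # let the worker start and block
    protocol.connection_lost(None)  # simulate the transport closing
    try:
        await protocol.worker
    except asyncio.CancelledError:
        pass                        # expected: the worker was cancelled
    return protocol.worker.cancelled()


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

With this shape the task ends in a cancelled state as soon as the connection is lost, instead of failing later on a reader that is already None.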
In my local version, I've added this before any code where self.reader is used:
    if not self.reader:
        logger.warning('No reader found.')
        raise exceptions.ChannelClosed()
and then added code to catch the exception where those methods are used, though I don't seem to have caught it everywhere it shows up. If anyone could help with letting me know if that's the proper way to check for it, and if so, at what level of the code I should catch them (in .protocol? outside the package where I use it?) I would really appreciate it. This, or some other way to handle the error, should also be added to the code here once it's been tested.
I actually only added this exception catching to the run() method in protocol.py:
    except exceptions.ChannelClosed as exc:
        logger.error('protocol.py: Channel closed, close connection')
        self.stop_now.set_result(None)
        self._close_channels(exception=exc)
after the except defined for exceptions.AmqpClosedConnection, and that seems to have worked, though I haven't been able to induce the error state reliably in order to test it thoroughly.
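To make the workaround easier to discuss, here is a self-contained sketch of the guard and the except clause working together. It is an illustration under assumptions, not the library's code: FrameLoop, _require_reader(), the one-byte length framing and the local ChannelClosed class are all hypothetical stand-ins. The guard is applied before each readexactly() call, since the reader can disappear between two reads of the same frame.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


class ChannelClosed(Exception):
    """Local stand-in for the library's exceptions.ChannelClosed."""


class FrameLoop:
    """Hypothetical, stripped-down stand-in for the protocol class."""

    def __init__(self, reader):
        self.reader = reader     # set to None once the connection is lost
        self.stop_now = None     # future created in run()
        self.closed_with = None  # records what _close_channels() received

    def _require_reader(self):
        # The guard from the comment above, factored into a helper.
        if not self.reader:
            logger.warning('No reader found.')
            raise ChannelClosed()
        return self.reader

    async def read_frame(self):
        # Assumed framing for the demo: 1-byte length, then the payload.
        header = await self._require_reader().readexactly(1)
        # The connection may have died while we awaited the header.
        return await self._require_reader().readexactly(header[0])

    def _close_channels(self, exception=None):
        self.closed_with = exception

    async def run(self):
        self.stop_now = asyncio.get_running_loop().create_future()
        frames = []
        while not self.stop_now.done():
            try:
                frames.append(await self.read_frame())
            except ChannelClosed as exc:
                logger.error('Channel closed, close connection')
                self.stop_now.set_result(None)
                self._close_channels(exception=exc)
        return frames


async def demo():
    reader = asyncio.StreamReader()
    reader.feed_data(b'\x02hi')      # one complete frame
    frame_loop = FrameLoop(reader)
    task = asyncio.ensure_future(frame_loop.run())
    await asyncio.sleep(0)           # first frame read, now blocked on a header
    frame_loop.reader = None         # simulate connection_lost() clearing it
    reader.feed_data(b'\x02')        # wake the pending header read
    frames = await task
    return frames, frame_loop.closed_with


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In the demo the loop stops at the second read of the second frame, which is the case the guard before the first read alone would miss.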
So this issue may still happen further along in the same function (there's another call to readexactly()), but the first one is the most likely to cause issues. Let's see if this fix helps; we can fix any further fallout later.
Thanks
Hi! When do you plan to release this bugfix?
This error occurs randomly, but it does occur many times. And because it pollutes the logs, I can't locate its cause.
Sorry that I can't reproduce it this time. Just letting you know about it.