gmr / rabbitpy

A pure python, thread-safe, minimalistic and pythonic RabbitMQ client library
http://rabbitpy.readthedocs.org
BSD 3-Clause "New" or "Revised" License

Swallowed exceptions in heartbeat thread causing hangs #135

Open mb-syss opened 3 years ago

mb-syss commented 3 years ago

We occasionally see the error below (rabbitpy 2.0.1), after which messaging hangs. We have some reconnection logic in place, but in this situation no exception seems to reach user code at all.

From what I understand from browsing the code, if the heartbeat check runs at the right (or rather, wrong) time, _check_for_exceptions removes the exception from the queue and raises it in the heartbeat thread, where nothing handles it. The connection/channel state does not seem to be updated either. User code therefore never notices the closed connection or dead IO thread and hangs indefinitely.
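
To make the suspected race concrete, here is a minimal stand-alone sketch of the pattern. It does not use rabbitpy; the queue, the exception class and both functions are hypothetical stand-ins that only mirror the names in the traceback, so treat it as an illustration of the mechanism, not as the library's actual code.

```python
import queue
import threading


class ConnectionResetException(Exception):
    """Hypothetical stand-in for rabbitpy.exceptions.ConnectionResetException."""


# Hypothetical stand-in for the queue the IO thread uses to report errors.
exceptions = queue.Queue()
raised_in = []  # names of the threads that received the exception


def check_for_exceptions():
    """Pop a pending exception, if any, and raise it in the calling thread."""
    if not exceptions.empty():
        raise exceptions.get()


def heartbeat_tick():
    """Runs on a timer thread, playing the role of the heartbeat's _maybe_send."""
    try:
        check_for_exceptions()
    except ConnectionResetException:
        # In the reported traceback nothing catches this and the timer
        # thread simply dies; it is recorded here only to show where it went.
        raised_in.append(threading.current_thread().name)


# The IO thread reports a socket-level reset.
exceptions.put(ConnectionResetException("Connection was reset at socket level"))

# The heartbeat timer fires before user code performs its next check.
timer = threading.Timer(0.01, heartbeat_tick)
timer.start()
timer.join()

# User code checks afterwards: the queue is already empty, so nothing is raised.
check_for_exceptions()
print("raised in:", raised_in)
print("exceptions left for user code:", exceptions.qsize())
```

The exception is consumed exactly once, by whichever thread checks first. If that thread is the heartbeat timer, the main thread has nothing left to observe.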

[2021-08-27 12:36:40,207] [rabbitpy.io:CRITICAL] In on_error: 'Connection reset'
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/usr/local/lib/python3.8/threading.py", line 1254, in run
    self.function(*self.args, **self.kwargs)
  File "/usr/local/lib/python3.8/site-packages/rabbitpy/heartbeat.py", line 59, in _maybe_send
    self._channel0.send_heartbeat()
  File "/usr/local/lib/python3.8/site-packages/rabbitpy/channel0.py", line 140, in send_heartbeat
    self.write_frame(heartbeat.Heartbeat())
  File "/usr/local/lib/python3.8/site-packages/rabbitpy/base.py", line 251, in write_frame
    if self._can_write():
  File "/usr/local/lib/python3.8/site-packages/rabbitpy/base.py", line 285, in _can_write
    self._check_for_exceptions()
  File "/usr/local/lib/python3.8/site-packages/rabbitpy/base.py", line 313, in _check_for_exceptions
    raise exception
rabbitpy.exceptions.ConnectionResetException: Connection was reset at socket level

matteogrolla commented 1 year ago

Hi, we have a similar issue on 2.0.1 under Python 3.10. It is a serious problem for us because it hangs our producer until someone intervenes manually. Is there any solution or workaround? Thanks.

{ "name": "rabbitpy.io", "levelname": "CRITICAL", "message": "In on_error: ConnectionResetError(104, 'Connessione interrotta dal corrispondente')", "time": "2022-09-15T08:23:57.861858" }

Exception in thread Thread-1996:
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/threading.py", line 1009, in _bootstrap_inner
    self.run()
  File "/usr/local/lib/python3.10/threading.py", line 1358, in run
    self.function(*self.args, **self.kwargs)
  File "/root/.local/share/virtualenvs/app-4PlAip0Q/lib/python3.10/site-packages/rabbitpy/heartbeat.py", line 59, in _maybe_send
    self._channel0.send_heartbeat()
  File "/root/.local/share/virtualenvs/app-4PlAip0Q/lib/python3.10/site-packages/rabbitpy/channel0.py", line 140, in send_heartbeat
    self.write_frame(heartbeat.Heartbeat())
  File "/root/.local/share/virtualenvs/app-4PlAip0Q/lib/python3.10/site-packages/rabbitpy/base.py", line 251, in write_frame
    if self._can_write():
  File "/root/.local/share/virtualenvs/app-4PlAip0Q/lib/python3.10/site-packages/rabbitpy/base.py", line 285, in _can_write
    self._check_for_exceptions()
  File "/root/.local/share/virtualenvs/app-4PlAip0Q/lib/python3.10/site-packages/rabbitpy/base.py", line 313, in _check_for_exceptions
    raise exception
rabbitpy.exceptions.ConnectionResetException: Connection was reset at socket level
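
One possible stopgap, sketched with the standard library only and not taken from rabbitpy: both tracebacks show the exception escaping the thread's top level, so a threading.excepthook handler (Python 3.8+) should be able to see it and raise a flag that the application checks. The names background_thread_failed, failures, record_thread_failure and failing_heartbeat are hypothetical, and the failing timer only simulates the heartbeat thread. This is an untested assumption about how it would behave against rabbitpy itself.

```python
import threading

# Set when any background thread dies with an uncaught exception.
background_thread_failed = threading.Event()
failures = []  # (thread name, exception) pairs, kept for logging

_default_hook = threading.excepthook


def record_thread_failure(args):
    """threading.excepthook handler: remember the failure, then report it."""
    name = args.thread.name if args.thread is not None else None
    failures.append((name, args.exc_value))
    background_thread_failed.set()
    _default_hook(args)  # keep the usual traceback on stderr


threading.excepthook = record_thread_failure


def failing_heartbeat():
    # Hypothetical stand-in for the heartbeat timer hitting the reset.
    raise ConnectionError("Connection was reset at socket level")


timer = threading.Timer(0.01, failing_heartbeat)
timer.start()
timer.join()

if background_thread_failed.is_set():
    # This is where an application would tear down and re-create its
    # connection instead of waiting on the dead one.
    print("background thread failed:", failures[0][1])
```

A flag like this only helps if the publishing code gets a chance to look at it. If the producer is blocked inside a library call, the check would have to run from a separate watchdog thread or between publishes that have their own timeout.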