galaxyproject / pulsar

Distributed job execution application built for Galaxy
https://pulsar.readthedocs.io
Apache License 2.0

Add catchall OSError to recoverable exceptions #338

Closed: mvdbeek closed this 1 year ago

mvdbeek commented 1 year ago

I'd love it if this weren't so broad, but it fixes the following error, which currently makes the consumer quit:

Sep 14 02:50:25 jetstream2.galaxyproject.org pulsar[3746252]: 2023-09-14 02:50:25,143 ERROR [pulsar.client.amqp_exchange][consume-setup-amqp://main_pulsar:********@amqp.galaxyproject.org:5671//main_pulsar?ssl=1] Problem consuming queue, consumer quitting in problematic fashion!
Sep 14 02:50:25 jetstream2.galaxyproject.org pulsar[3746252]: Traceback (most recent call last):
Sep 14 02:50:25 jetstream2.galaxyproject.org pulsar[3746252]:   File "/srv/pulsar/main/venv/lib64/python3.9/site-packages/pulsar/client/amqp_exchange.py", line 143, in consume
Sep 14 02:50:25 jetstream2.galaxyproject.org pulsar[3746252]:     connection.drain_events(timeout=self.__timeout)
Sep 14 02:50:25 jetstream2.galaxyproject.org pulsar[3746252]:   File "/srv/pulsar/main/venv/lib64/python3.9/site-packages/kombu/connection.py", line 316, in drain_events
Sep 14 02:50:25 jetstream2.galaxyproject.org pulsar[3746252]:     return self.transport.drain_events(self.connection, **kwargs)
Sep 14 02:50:25 jetstream2.galaxyproject.org pulsar[3746252]:   File "/srv/pulsar/main/venv/lib64/python3.9/site-packages/kombu/transport/pyamqp.py", line 169, in drain_events
Sep 14 02:50:25 jetstream2.galaxyproject.org pulsar[3746252]:     return connection.drain_events(**kwargs)
Sep 14 02:50:25 jetstream2.galaxyproject.org pulsar[3746252]:   File "/srv/pulsar/main/venv/lib64/python3.9/site-packages/amqp/connection.py", line 525, in drain_events
Sep 14 02:50:25 jetstream2.galaxyproject.org pulsar[3746252]:     while not self.blocking_read(timeout):
Sep 14 02:50:25 jetstream2.galaxyproject.org pulsar[3746252]:   File "/srv/pulsar/main/venv/lib64/python3.9/site-packages/amqp/connection.py", line 530, in blocking_read
Sep 14 02:50:25 jetstream2.galaxyproject.org pulsar[3746252]:     frame = self.transport.read_frame()
Sep 14 02:50:25 jetstream2.galaxyproject.org pulsar[3746252]:   File "/srv/pulsar/main/venv/lib64/python3.9/site-packages/amqp/transport.py", line 294, in read_frame
Sep 14 02:50:25 jetstream2.galaxyproject.org pulsar[3746252]:     frame_header = read(7, True)
Sep 14 02:50:25 jetstream2.galaxyproject.org pulsar[3746252]:   File "/srv/pulsar/main/venv/lib64/python3.9/site-packages/amqp/transport.py", line 582, in _read
Sep 14 02:50:25 jetstream2.galaxyproject.org pulsar[3746252]:     raise OSError('Server unexpectedly closed connection')
Sep 14 02:50:25 jetstream2.galaxyproject.org pulsar[3746252]: OSError: Server unexpectedly closed connection
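
For readers unfamiliar with the pattern, here is a minimal sketch of what treating `OSError` as recoverable could look like. This is not Pulsar's actual code: `RECOVERABLE_EXCEPTIONS`, `consume_with_retry`, and `make_flaky` are hypothetical names, and the flaky callable only stands in for a call like `connection.drain_events(...)` from the traceback above.

```python
import socket
import time

# Hypothetical tuple for illustration; the real list in Pulsar may differ.
# OSError is the catch-all: socket.timeout and ConnectionError are already
# subclasses of it, so listing them is redundant but documents intent.
RECOVERABLE_EXCEPTIONS = (socket.timeout, ConnectionError, OSError)


def consume_with_retry(drain_once, max_attempts=5, delay=0.0):
    """Call drain_once() until it succeeds, retrying on recoverable errors.

    Returns the number of attempts used. Anything outside
    RECOVERABLE_EXCEPTIONS propagates immediately.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            drain_once()
            return attempt
        except RECOVERABLE_EXCEPTIONS:
            if attempt == max_attempts:
                raise  # out of retries, let the caller see the error
            time.sleep(delay)  # a real consumer would reconnect here


def make_flaky(failures):
    """Build a stand-in for drain_events that fails `failures` times."""
    state = {"left": failures}

    def drain_once():
        if state["left"] > 0:
            state["left"] -= 1
            raise OSError("Server unexpectedly closed connection")

    return drain_once


if __name__ == "__main__":
    attempts = consume_with_retry(make_flaky(2))
    print(f"succeeded after {attempts} attempts")
```

With `OSError` in the tuple, the "Server unexpectedly closed connection" error is retried instead of ending the consumer; a `ValueError` raised by the callable would still propagate.
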

natefoo commented 1 year ago

Edging ever closer to a bare except: :smile:
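
To put the breadth concern in concrete terms, the short check below (an illustrative sketch, with `caught_by_oserror` as a made-up helper name) shows which built-in exceptions an `except OSError:` clause would catch. It covers connection errors but also unrelated file errors, while still letting through things a bare `except:` would swallow, such as `KeyboardInterrupt`.

```python
import socket


def caught_by_oserror(exc_type):
    """Return True if `except OSError:` would catch this exception type."""
    return issubclass(exc_type, OSError)


if __name__ == "__main__":
    candidates = (
        ConnectionResetError,  # network failure, the intended target
        BrokenPipeError,       # network failure
        socket.timeout,        # subclass of OSError (alias of TimeoutError on 3.10+)
        FileNotFoundError,     # unrelated filesystem error, also caught
        PermissionError,       # unrelated filesystem error, also caught
        ValueError,            # not an OSError
        KeyboardInterrupt,     # not even an Exception subclass
    )
    for exc_type in candidates:
        print(exc_type.__name__, caught_by_oserror(exc_type))
```
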