zodb / relstorage

A backend for ZODB that stores pickles in a relational database.

Unhandled disconnect errors with psycopg2 and pgBouncer #412

Closed sjustas closed 3 years ago

sjustas commented 4 years ago

pgBouncer, a connection pooler for PostgreSQL, tends to send SQLSTATE 08P01 (PROTOCOL_VIOLATION) when it unexpectedly loses its connection to the server, or when an attempt to connect times out.

psycopg2 classifies ProtocolViolation as a DatabaseError, so it is not treated as a disconnect and is not retried in poll_invalidations during newTransaction.

Since pgBouncer is a common way to scale PostgreSQL, maybe it's worth adding ProtocolViolation to the psycopg2 driver's disconnected_exceptions?

Or maybe even add the whole DatabaseError family to cover all connection failures (see "Class 08: Connection Exception" at https://www.psycopg.org/docs/errors.html).

This probably affects the pg8000 driver too.
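
To make the "Class 08" idea concrete, here is a minimal sketch that decides whether an error is a connection failure from its SQLSTATE instead of its exception class. psycopg2 errors expose the SQLSTATE as the `pgcode` attribute; everything else here (`FakeDriverError`, `is_connection_exception`, `call_with_reconnect`) is a hypothetical name made up for illustration and is not RelStorage or psycopg2 API.

```python
# Stand-in for a driver exception so the sketch runs without a database.
# Real psycopg2 errors carry the SQLSTATE in `pgcode` (None if unavailable).
class FakeDriverError(Exception):
    def __init__(self, message, pgcode=None):
        super().__init__(message)
        self.pgcode = pgcode


def is_connection_exception(exc):
    """True when exc carries a SQLSTATE from class 08 (Connection Exception)."""
    pgcode = getattr(exc, "pgcode", None)
    return isinstance(pgcode, str) and pgcode.startswith("08")


def call_with_reconnect(func, reconnect, attempts=2):
    """Call func(); on a class 08 error, call reconnect() and try again.

    Hypothetical helper: any other error, or a class 08 error on the
    last attempt, propagates to the caller unchanged.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return func()
        except Exception as exc:
            last_attempt = attempt == attempts - 1
            if last_attempt or not is_connection_exception(exc):
                raise
            reconnect()
```

With this approach 08P01 from pgBouncer would be retried the same way as 08006 (connection_failure), without listing each exception class by hand. The trade-off is that it relies on the driver populating `pgcode`, which may not hold for every driver.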

Traceback (most recent call last):
  File ".../ve3.7/lib/python3.7/site-packages/celery/app/trace.py", line 385, in trace_task
    R = retval = fun(*args, **kwargs)
  <...>
  File ".../src/app/taskcontext.py", line 39, in celery_request
    conn = db.open()
  File ".../ve3.7/lib/python3.7/site-packages/ZODB/DB.py", line 793, in open
    result.open(transaction_manager)
  File ".../ve3.7/lib/python3.7/site-packages/ZODB/Connection.py", line 921, in open
    self.newTransaction(None, False)
  File ".../ve3.7/lib/python3.7/site-packages/ZODB/Connection.py", line 737, in newTransaction
    invalidated = self._storage.poll_invalidations()
  File ".../ve3.7/lib/python3.7/site-packages/relstorage/storage.py", line 1421, in poll_invalidations
    changes, new_polled_tid = self._restart_load_and_poll()
  File ".../ve3.7/lib/python3.7/site-packages/relstorage/storage.py", line 1395, in _restart_load_and_poll
    self._adapter.poller.poll_invalidations, prev, ignore_tid)
  File ".../ve3.7/lib/python3.7/site-packages/relstorage/storage.py", line 365, in _restart_load_and_call
    return f(self._load_conn, self._load_cursor, *args, **kw)
  File ".../ve3.7/lib/python3.7/site-packages/relstorage/adapters/poller.py", line 95, in poll_invalidations
    cursor.execute(self.poll_query)
psycopg2.errors.ProtocolViolation: server conn crashed?
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.

Originating from https://github.com/pgbouncer/pgbouncer/blob/3c80e38193c8422a91d43bf919cdbeceea2b578f/src/server.c#L481

Another ProtocolViolation I've seen is query_wait_timeout (a pgBouncer setting, see https://www.pgbouncer.org/config.html):

psycopg2.errors.ProtocolViolation: query_wait_timeout
server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.

sjustas commented 3 years ago

On the other hand, drivers should conform to PEP 249 better and raise OperationalError in this case, since PEP 249 lists an unexpected disconnect as an example of OperationalError.

Made a psycopg2 PR https://github.com/psycopg/psycopg2/pull/1148
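
The PEP 249 point can be illustrated with stand-in classes. This is only a sketch: the classes below mirror the PEP 249 hierarchy but are not imported from any driver, the two `ProtocolViolation*` classes are hypothetical names for the two possible classifications, and the assumption that the disconnect handler catches `OperationalError` is mine, not taken from the RelStorage source.

```python
# Stand-in classes mirroring the PEP 249 exception hierarchy.
class Error(Exception):
    pass


class DatabaseError(Error):
    pass


class OperationalError(DatabaseError):
    pass


# Hypothetical: the classification described in this issue.
class ProtocolViolationAsDatabaseError(DatabaseError):
    pass


# Hypothetical: the classification argued for in this comment.
class ProtocolViolationAsOperationalError(OperationalError):
    pass


def handled_as_disconnect(exc_type, disconnected_exceptions=(OperationalError,)):
    """Report whether a handler for disconnected_exceptions catches exc_type.

    The default tuple is an assumption made for this sketch.
    """
    try:
        raise exc_type("server conn crashed?")
    except disconnected_exceptions:
        return True   # would be retried after reconnecting
    except DatabaseError:
        return False  # would propagate to the caller
```

Under that assumption, `handled_as_disconnect(ProtocolViolationAsDatabaseError)` is false and `handled_as_disconnect(ProtocolViolationAsOperationalError)` is true, which is why fixing the base class in the driver would make the error retryable without changes on the RelStorage side.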