jdelic / django-dbconn-retry

Patches Django to reconnect on a failed database connection once before failing. Helping with running Django ORM through HAProxy, for example.
BSD 3-Clause "New" or "Revised" License
75 stars 25 forks source link

Reconnection fails on first attempt #5

Open carlfischerjba opened 4 years ago

carlfischerjba commented 4 years ago

I'm running Django and Postgres in two separate Docker containers. If I restart the Postgres container to simulate a connection problem, the connection is only reestablished in a Django shell on the second attempt to run a query.

>>> User.objects.first()  # This works.
<User: admin>
>>> # restart database to force loss of connection
>>> User.objects.first()  # This first attempt fails.
Traceback (most recent call last):
  File "/venv/lib64/python3.6/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
psycopg2.OperationalError: server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/venv/lib64/python3.6/site-packages/django/db/models/manager.py", line 82, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/venv/lib64/python3.6/site-packages/django/db/models/query.py", line 653, in first
    for obj in (self if self.ordered else self.order_by('pk'))[:1]:
  File "/venv/lib64/python3.6/site-packages/django/db/models/query.py", line 274, in __iter__
    self._fetch_all()
  File "/venv/lib64/python3.6/site-packages/django/db/models/query.py", line 1242, in _fetch_all
    self._result_cache = list(self._iterable_class(self))
  File "/venv/lib64/python3.6/site-packages/django/db/models/query.py", line 55, in __iter__
    results = compiler.execute_sql(chunked_fetch=self.chunked_fetch, chunk_size=self.chunk_size)
  File "/venv/lib64/python3.6/site-packages/django/db/models/sql/compiler.py", line 1100, in execute_sql
    cursor.execute(sql, params)
  File "/venv/lib64/python3.6/site-packages/django/db/backends/utils.py", line 67, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
  File "/venv/lib64/python3.6/site-packages/django/db/backends/utils.py", line 76, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "/venv/lib64/python3.6/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/venv/lib64/python3.6/site-packages/django/db/utils.py", line 89, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/venv/lib64/python3.6/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
django.db.utils.OperationalError: server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.

>>> User.objects.first()  # This second attempt succeeds.
<User: admin>
>>> django.__version__
'2.2.5'
carlfischerjba commented 4 years ago

I commented out the django_dbconn_retry app to confirm that it's actually being used. As far as i can tell, it is because the exception that occurs on the first attempt to contact the db is the same in both cases (django.db.utils.OperationalError: server closed the connection unexpectedly). But the second attempt and all subsequent attempts show the slightly different psycopg2.InterfaceError: connection already closed. This error never occurs when using django_dbconn_retry.

>>> User.objects.all()
<QuerySet [<User: admin>]>
>>> User.objects.all()
Traceback (most recent call last):
  File "/venv/lib64/python3.6/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
psycopg2.OperationalError: server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/venv/lib64/python3.6/site-packages/django/db/models/query.py", line 250, in __repr__
    data = list(self[:REPR_OUTPUT_SIZE + 1])
  File "/venv/lib64/python3.6/site-packages/django/db/models/query.py", line 274, in __iter__
    self._fetch_all()
  File "/venv/lib64/python3.6/site-packages/django/db/models/query.py", line 1242, in _fetch_all
    self._result_cache = list(self._iterable_class(self))
  File "/venv/lib64/python3.6/site-packages/django/db/models/query.py", line 55, in __iter__
    results = compiler.execute_sql(chunked_fetch=self.chunked_fetch, chunk_size=self.chunk_size)
  File "/venv/lib64/python3.6/site-packages/django/db/models/sql/compiler.py", line 1100, in execute_sql
    cursor.execute(sql, params)
  File "/venv/lib64/python3.6/site-packages/django/db/backends/utils.py", line 67, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
  File "/venv/lib64/python3.6/site-packages/django/db/backends/utils.py", line 76, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "/venv/lib64/python3.6/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/venv/lib64/python3.6/site-packages/django/db/utils.py", line 89, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/venv/lib64/python3.6/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
django.db.utils.OperationalError: server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.

>>> User.objects.all()
Traceback (most recent call last):
  File "/venv/lib64/python3.6/site-packages/django/db/backends/base/base.py", line 235, in _cursor
    return self._prepare_cursor(self.create_cursor(name))
  File "/venv/lib64/python3.6/site-packages/django/db/backends/postgresql/base.py", line 223, in create_cursor
    cursor = self.connection.cursor()
psycopg2.InterfaceError: connection already closed

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/venv/lib64/python3.6/site-packages/django/db/models/query.py", line 250, in __repr__
    data = list(self[:REPR_OUTPUT_SIZE + 1])
  File "/venv/lib64/python3.6/site-packages/django/db/models/query.py", line 274, in __iter__
    self._fetch_all()
  File "/venv/lib64/python3.6/site-packages/django/db/models/query.py", line 1242, in _fetch_all
    self._result_cache = list(self._iterable_class(self))
  File "/venv/lib64/python3.6/site-packages/django/db/models/query.py", line 55, in __iter__
    results = compiler.execute_sql(chunked_fetch=self.chunked_fetch, chunk_size=self.chunk_size)
  File "/venv/lib64/python3.6/site-packages/django/db/models/sql/compiler.py", line 1098, in execute_sql
    cursor = self.connection.cursor()
  File "/venv/lib64/python3.6/site-packages/django/db/backends/base/base.py", line 256, in cursor
    return self._cursor()
  File "/venv/lib64/python3.6/site-packages/django/db/backends/base/base.py", line 235, in _cursor
    return self._prepare_cursor(self.create_cursor(name))
  File "/venv/lib64/python3.6/site-packages/django/db/utils.py", line 89, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/venv/lib64/python3.6/site-packages/django/db/backends/base/base.py", line 235, in _cursor
    return self._prepare_cursor(self.create_cursor(name))
  File "/venv/lib64/python3.6/site-packages/django/db/backends/postgresql/base.py", line 223, in create_cursor
    cursor = self.connection.cursor()
django.db.utils.InterfaceError: connection already closed
CharlesB2 commented 2 years ago

This is because the connection is marked as active, although closed by the server, and it's not a use case handled by the patch.

I added the following at the beginning of the function to handle the case where a connection is obsolete and needs to be closed (provided you have set the CONN_MAX_AGE django setting:

if (
    self.connection is not None
    and self.close_at is not None
    and time.monotonic() >= self.close_at
):
    _log.debug("closing obsolete connection")
    self.close()

This won't handle a restarted server, but at least it handles connections closed by the server during a long request.