celery / django-celery-results

Celery result back end with django

Issue with Django DATABASES CONN_MAX_AGE #428

Open martinlehoux opened 7 months ago

martinlehoux commented 7 months ago

From time to time, my Postgres database fills up with idle connections, which can cause new connections to fail and make my application unavailable.

My solution was to set idle_session_timeout and idle_in_transaction_session_timeout to 15 min on the database, which should be fine because Django is configured with CONN_MAX_AGE = 600 (10 min).
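For reference, the setup described above might look roughly like the sketch below. The database name `mydb`, the engine line, and the constant `PG_IDLE_TIMEOUT_SECONDS` are placeholders for illustration, not values from the actual project. The two PostgreSQL parameters appear only as comments because they are set on the server, not in Django.

```python
# settings.py (sketch; everything except CONN_MAX_AGE = 600 is assumed)

# Server-side timeouts are applied in PostgreSQL rather than in Django, e.g.:
#   ALTER DATABASE mydb SET idle_session_timeout = '15min';  -- PostgreSQL 14+
#   ALTER DATABASE mydb SET idle_in_transaction_session_timeout = '15min';
PG_IDLE_TIMEOUT_SECONDS = 15 * 60  # hypothetical constant, for comparison only

DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "mydb",  # placeholder database name
        "CONN_MAX_AGE": 600,  # seconds; maximum lifetime of a Django connection
    }
}

# Intent: Django retires a connection (after 600 s) before the server
# would terminate it as idle (after 900 s).
margin_seconds = PG_IDLE_TIMEOUT_SECONDS - DATABASES["default"]["CONN_MAX_AGE"]
```

The 300-second margin only helps if something on the Django side actually checks the connection age in time.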

But now my worker, which runs with --pool gevent, fails from time to time, and the stack trace shows:

# celery/worker/request.py
self.task.backend.mark_as_revoked(...)
...
# django_celery_results/backends/database.py
self.TaskModel._default_manager.store_result(**task_props)
...
# psycopg/connection.py
def _check_connection_ok(...):
    raise e.OperationalError("the connection is closed")

My understanding is that the database closed the connection before the application did. My investigation showed that Django manages the connection life cycle with two signals:

# django/db/__init__.py
def close_old_connections(**kwargs):
    for conn in connections.all(initialized_only=True):
        conn.close_if_unusable_or_obsolete()

signals.request_started.connect(close_old_connections)
signals.request_finished.connect(close_old_connections)
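To illustrate why this matters for a worker, here is a toy model of the timing. It is not Django's actual code: `ToyConnection` and `server_has_killed` are hypothetical names. The model assumes only that the client-side age limit is checked when a cleanup hook runs (as in `close_old_connections` above), while the server enforces its idle timeout on its own.

```python
class ToyConnection:
    """Simplified stand-in for a database connection wrapper (hypothetical)."""

    def __init__(self, opened_at, max_age):
        # Deadline after which the client should retire the connection.
        self.close_at = opened_at + max_age
        self.last_used = opened_at
        self.client_thinks_open = True

    def close_if_obsolete(self, now):
        # Client-side check: it only happens when something calls it.
        if now >= self.close_at:
            self.client_thinks_open = False


def server_has_killed(conn, now, idle_timeout):
    # Server-side check: idle sessions are terminated regardless of the client.
    return now - conn.last_used >= idle_timeout


CONN_MAX_AGE = 600  # seconds, as in the Django settings
IDLE_TIMEOUT = 900  # seconds, as in the PostgreSQL settings
now = 1000          # moment of the next use, after both limits have passed

# Case 1: nothing triggers the cleanup hook while the worker sits idle.
conn = ToyConnection(opened_at=0, max_age=CONN_MAX_AGE)
stale_use = conn.client_thinks_open and server_has_killed(conn, now, IDLE_TIMEOUT)

# Case 2: the cleanup hook runs right before the connection is used.
conn2 = ToyConnection(opened_at=0, max_age=CONN_MAX_AGE)
conn2.close_if_obsolete(now)
reconnect_first = not conn2.client_thinks_open

print(stale_use, reconnect_first)
```

In case 1 the client still believes the connection is open although the server has already dropped it, which would be consistent with the "the connection is closed" error in the trace. In case 2 the client notices the expired deadline and would open a fresh connection first.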

But I couldn't find such a mechanism in Django Celery Results. So my questions:

martinlehoux commented 6 months ago

I have done some more research, and the issue seems to run deeper with gevent workers / web workers. Django 5.1's support for psycopg connection pools might bring improvements, but for now I will try switching to persistent connections and removing the idle-session cleanup from my Postgres database.
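As a rough sketch of the two directions mentioned here, the settings might look like this. The names `DATABASES_PERSISTENT` and `DATABASES_POOLED` and the database name `mydb` are placeholders (a real settings module would define a single `DATABASES`). The pool option assumes Django 5.1 or later with psycopg 3 and its pool extra installed. Whether either variant behaves well under gevent is not something this sketch can confirm.

```python
# Option 1: persistent connections, with no client-side age limit.
DATABASES_PERSISTENT = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "mydb",  # placeholder database name
        "CONN_MAX_AGE": None,  # None: Django never retires the connection by age
    }
}

# Option 2: psycopg connection pool (Django 5.1+).
DATABASES_POOLED = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "mydb",  # placeholder database name
        # To my understanding, pooling is not combined with persistent
        # connections, so CONN_MAX_AGE stays at 0 here.
        "CONN_MAX_AGE": 0,
        "OPTIONS": {"pool": True},
    }
}
```

With option 1, the server-side idle timeouts would need to be removed or relaxed, as described above, since nothing on the client side would retire an idle connection.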