freelawproject / courtlistener

A fully-searchable and accessible archive of court data including growing repositories of opinions, oral arguments, judges, judicial financial records, and federal filings.
https://www.courtlistener.com

Many database issues after enabling connection pools #4456

Open sentry-io[bot] opened 1 month ago

sentry-io[bot] commented 1 month ago

We got a lot of errors in Sentry after enabling connection pools in Django 5.1 with psycopg3. The one below is representative, and I'll see if I can link others.

For now, I'm just going to disable connection pools until we have a theory of what's going on.
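For reference, here is a minimal sketch of how psycopg3 connection pooling is typically switched on in Django 5.1 settings. The database name and pool sizes are illustrative assumptions, not the project's actual configuration. The `pool` dict is passed through to `psycopg_pool.ConnectionPool`, whose `timeout` defaults to 30 seconds, which matches the "couldn't get a connection after 30.00 sec" message below.

```python
# Hypothetical settings fragment: NAME and the pool sizes are assumed values
# chosen for illustration, not taken from CourtListener's real settings.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "courtlistener",  # assumed database name
        # Pooling is incompatible with persistent connections, so keep this at 0.
        "CONN_MAX_AGE": 0,
        "OPTIONS": {
            # Either True (pool defaults) or a dict of ConnectionPool kwargs.
            "pool": {
                "min_size": 2,   # assumed
                "max_size": 4,   # assumed
                "timeout": 30,   # seconds to wait for a free connection
            },
        },
    }
}
```

Disabling the pool again amounts to removing the `pool` key (or setting it to `False`).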

We're using gunicorn with the following command:

    exec gunicorn cl.asgi:application \
        --chdir /opt/courtlistener/ \
        --user www-data \
        --group www-data \
        --workers ${NUM_WORKERS:-48} \
        --worker-class cl.workers.UvicornWorker \
        --limit-request-line 6000 \
        --timeout 0 \
        --bind 0.0.0.0:8000

And the custom worker we have there is:

    from typing import Any, Dict

    from uvicorn.workers import UvicornWorker as BaseUvicornWorker

    class UvicornWorker(BaseUvicornWorker):
        # Overrides passed to uvicorn's Config; lifespan events are disabled.
        CONFIG_KWARGS: Dict[str, Any] = {
            "loop": "auto",
            "http": "auto",
            "lifespan": "off",
        }

We use postgres in AWS RDS.

OperationalError: couldn't get a connection after 30.00 sec

Sentry Issue: COURTLISTENER-85G

PoolTimeout: couldn't get a connection after 30.00 sec
(1 additional frame(s) were not displayed)
...
  File "django/utils/asyncio.py", line 26, in inner
    return func(*args, **kwargs)
  File "django/db/backends/base/base.py", line 256, in connect
    self.connection = self.get_new_connection(conn_params)
  File "django/utils/asyncio.py", line 26, in inner
    return func(*args, **kwargs)
  File "django/db/backends/postgresql/base.py", line 330, in get_new_connection
    connection = self.pool.getconn()
  File "psycopg_pool/pool.py", line 202, in getconn
    raise PoolTimeout(

OperationalError: couldn't get a connection after 30.00 sec
(17 additional frame(s) were not displayed)
...
  File "cl/lib/celery_utils.py", line 181, in wrapper
    return func(*args, **kwargs)
  File "cl/scrapers/tasks.py", line 424, in update_docket_info_iquery
    d = Docket.objects.get(pk=d_pk, court_id=court_id)

sentry-io[bot] commented 1 month ago

Sentry Issue: COURTLISTENER-84X

consuming input failed: terminating connection due to administrator command
SSL connection has been closed unexpectedly

albertisfu commented 1 month ago

I performed some tests related to these issues and was able to reproduce the following error locally:

OperationalError
consuming input failed: terminating connection due to administrator command
SSL connection has been closed unexpectedly

by setting the DB idle_session_timeout to a value smaller than max_idle.

However, @blancoramiro confirmed that this value is set to 0 for the production DB, meaning it's disabled, and connections shouldn't be terminated by the DB server.
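The condition used for the local reproduction can be written as a small check. This is only a sketch: the function name is hypothetical, and the 600-second figure in the usage note is the documented default of psycopg_pool's `max_idle`, not a value confirmed for this deployment.

```python
def server_may_kill_pooled_connections(
    idle_session_timeout_s: float, max_idle_s: float
) -> bool:
    """Return True when Postgres could terminate an idle pooled connection
    before the pool itself retires it.

    idle_session_timeout_s: the server's idle_session_timeout, in seconds
        (0 means the server-side timeout is disabled).
    max_idle_s: the pool's max_idle, in seconds.
    """
    if idle_session_timeout_s <= 0:
        # A disabled server-side timeout never closes idle sessions.
        return False
    return idle_session_timeout_s < max_idle_s
```

With `max_idle` at 600 seconds, a server timeout of 60 seconds satisfies the condition, while the production value of 0 does not, which is why this explanation does not cover the production errors.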

Therefore, the issue is probably caused by something else, which requires further investigation. I believe using the dev DB and simulating concurrent load would help identify the root cause and find a solution.

mlissner commented 1 month ago

Great, thanks. We could just ignore this, but it does seem like a good opportunity to help the community by being one of the first to solve it in the open.

I look forward to learning what you discover.