tournesol-app / tournesol

Free and open source code of the https://tournesol.app platform. Meet the community on Discord https://discord.gg/WvcSG55Bf3
https://tournesol.app
Other
330 stars 48 forks source link

Server errors on high load due to too many connections to PostgreSQL #1716

Open amatissart opened 1 year ago

amatissart commented 1 year ago

Problem

The number of concurrent requests that can be handled by the backend is currently limited by the number of connections to the database:

django.db.utils.OperationalError: connection to server at "localhost" (::1), port 5432 failed: FATAL:  remaining connection slots are reserved for non-replication superuser connections

This issue has been observed when a large number of clients tries to fetch preview images simultaneously: a connection to the database is required by the throttling mechanism configured using DRF + PostgreSQL as the cache.

The maximum number of connections allowed by the db configuration is already quite high (100). And configuring persistent db connections in Django with CONN_MAX_AGE is not appropriate because of a limitation with ASGI in Django >= 4.0: https://code.djangoproject.com/ticket/33497

Potential solutions

Solution 1

Keep the current async UvicornWorker with Django ASGI app, and use a third-party tool to configure a poll of connections to the database. For example, django-postgrespool2 provides such PostgreSQL connection pooling for Django.

Pros

Cons:

Solution 2

Switch from ASGI to WSGI and replace "uvicorn" worker. Using the gunicorn "gthread" worker class would probably be suitable for us. Then the number of concurrent requests to be executed by each worker can be configured with "threads".

Pros:

Cons:


My preference goes to Solution 2

GresilleSiffle commented 1 year ago

I also have a preference for the solution 2.