Open bnmnetp opened 10 months ago
@bjones1 -- would love your thoughts on this...
Sorry for the slow response! I'm heavily in favor of option 2. That makes so much sense and keeps the code a lot DRYer.
@bjones1 -- I have a prototype of option 2 on a branch. But there are some challenges.
A different solution came to me from mlhess at Umich. He uses pgbouncer
with his kubernetes setup. pgbouncer is a dedicated connection pooler for postgresql. (pgbouncer.org) We set up our database connection to connect to a local copy of pgbouncer
and it takes care of managing the connection pool and queuing requests. This would allow us to set a carefully chosen pool size for each worker. This feels really clean to me.
Any thoughs
I like this option, but does pgbouncer do what we want? If we use the session pooling mode (the default/most compatible), then it seems like it still allocates one server connection per client connection. I'm assuming we connect to the db on startup and don't terminate until shutdown.
Yes. Each worker would run a pgbouncer. Each pgbouncer would have 20 (configurable) connections to the database. The servers on the worker connect to pgbouncer. If there are more than 20 then it queues the requests before passing them on to the postgresql server.
I am planning to test it, by adding a new worker and having it run in this config.
Sounds great to me! Let me know what happens.
Each worker in the runestone swarm creates 65 database connections. This is too many and leads to postgresql issues.
The problem is that we create the session in the
async_session.py
file. which is then imported nearly anywhere.There are a couple of solutions. Both of which make use of the startup mechanism of FastAPI to create the session and pool and then store it in
app.state
so that we can later access it fromrequest.state
python - How to do persistent database connection in FastAPI? - Stack Overflow
How to maintain global pool of db connection and use it in each and every request? · Issue #1800 · tiangolo/fastapi · GitHub
The downside of this is that crud.py would need to change so that
request.app.state.db_session
object (which is the same as the async_session object we can import from async_session when we don't have a request)I guess the benefit of 2 is that we can avoid huge long lists of imported functions as they are all just methods now.