postgresml / pgcat

PostgreSQL pooler with sharding, load balancing and failover support.
MIT License
3.07k stars 183 forks source link

Single pool for multiple databases #806

Open nealoke opened 1 month ago

nealoke commented 1 month ago

Context We have a multi-tenant SaaS setup where each tenant has a database in a managed database cluster (Scaleway/DigitalOcean/...). Currently we are using a single cluster. The goal for us for using pgcat is that we can limit the amount of connections to our database cluster as the cluster has a limited amount of connections available.

Problem As we have one cluster we essentially only need one pgcat pool, the only issue I'm bouncing against is that it is impossible for us to allow multiple different databases inside a single pool. And if I were to create separate pools for each tenant, then the implementation with pgcat has no benefit as we can't limit the connections as a while towards the database cluster. A simple example:

Potential solution It would be awesome if we could allow a regex pattern for which databases should connect to a certain pool, in that way we could say that tenant-1 until tenant-50 should connect to pool 1 and tenant-51 until tenant-100 should connect to pool 2.

[general]
# ...

[pools]
[pools.postgres]
# ...

[pools.postgres.users.0]
# ...

[pools.postgres.shards.0]
servers = [["...", 5432, "primary"]]
database = "^tenant-(?:[1-9]|[1-4][0-9]|50)$"

Or is all this unneeded and am I missing something?

nealoke commented 1 month ago

I just noticed that we can have a shard per tenant and that would fix the issue where we can't limit the connections overall but these are not possible to be added through the pgcat admin api which would mean that we need to manually add them to the pgcat.toml file and restart the service everytime a tenant gets created which seems like a hassle and will cause interuptions in service of tenants using the application.

magec commented 1 month ago

Hey, I just read the issue. I see you've already kind of found the answer. Although both things seem 'doable' there is no way of having 'dynamic db names' on pools neither adding new pool config through admin api (which is mostly for statistics and a few commands). Either way, you can leverage pgcat live configuration reloading functionality to achive what you want. The only piece left is a simple service that handles the config changes, rewrite the config using some kind of templating system and signal pgcat to reload the configuration, that would be the way I'd do it. Also, bear in mind that you have to execute SET SHARD TO 'SHARD_NUMBER'; before each query/transaction for this to work.

nealoke commented 1 month ago

Thanks for the reply, we are working with Prisma so we can't really alter the SQL query in there so I'm affraid this won't be an option. If you have any other ideas that would be great, or other tools that could work as I'm not finding anything 🥲

magec commented 1 month ago

There is also a possibility, If you can 'annotate' with comments your query, pgcat can extract that information and do the sharding selection.