Closed dennyabrain closed 4 months ago
With Postgres, each new connection can take upto 1.3MB in memory
One thing that works in our favour is that we need this high throughput only for one specific write operation. This is the operation to write incoming whatsapp messages into the database. So we don't have to worry about any update collision related issues.
Tradeoffs for client-side pooling (e.g. Ecto) vs. middleware (external) pooling (e.g. pgBouncer, pgPool)
My understanding is that using something like PgBouncer can help you in situations in which for some reason you have to create/remove database connections in a fast way or you want to have more database connections than your database instance actually is supposed to support.
One example that I came personally across was when I was using elixir together with kubernetes. During some of our operations, we would spawn many kubernetes pods at the same time, and all of them would try to get Database connections together. In those times, it would be easy to go over the number database connections that our Postgres instance actually supported.
- https://stackoverflow.com/questions/71352508/what-are-the-pros-and-cons-of-client-side-connect-pools-vs-external-connection-p
From my experience and understanding, the disadvantages of an external pool are:
Usually, a connection pool on the application side is a good thing for the reasons you detail. An external connection pool only makes sense if
Other relevant info
https://hexdocs.pm/ecto/replicas-and-dynamic-repositories.html
When applications reach a certain scale, a single database may not be enough to sustain the required throughput. In such scenarios, it is very common to introduce read replicas: all write operations are sent to the primary database and most of the read operations are performed against the replicas. The credentials of the primary and replica databases are typically known upfront by the time the code is compiled.
https://hexdocs.pm/ecto/Ecto.Repo.html
:pool_size - the size of the pool used by the connection module. Defaults to 10
Max number of connections postgres can support
You can often support more concurrent users by reducing the number of database connections and using some form of connection pooling
(See section) How to Find the Optimal Database Connection Pool Size
- https://stackoverflow.com/questions/30778015/how-to-increase-the-max-connections-in-postgres
Load testing postgres on EC2
Instructions on how to run pgbench for profiling insert operations - https://github.com/tattle-made/feluda/wiki/Optimization#testing-insert-performance
One of the bottlenecks in the infra is how many concurrent writes can our database support. One of the scenarios we want to be prepared for is getting ten lakh messages over an hour. We have strategies in place to scale (vertically or horizontally) our web server. That means these web servers would open connections to our Postgres instance. The client library for dealing with databases (Ecto) used in the web server has support for managing connection pools. Scope of this feature is to find out