Seeing a lot of busy_dist_port messages due to TCP connections between nodes being saturated or otherwise experiencing some packet loss or head-of-line blocking.
We should also reduce the number of hops to the database to cut latency. For that we -just- need to share or acquire a global count of database connections.
Poll pg_stat_activity and cache Supavisor connection count on an interval
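max pool size per node should be a function of the cluster size (length(Node.list()) + 1, since Node.list excludes the local node) and the number of pg_stat_activity connections whose application_name matches the pool_pid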
application_name of DbHandlers should be something like Supavisor Pool - #{pool_pid}
Expire the cache after maybe 5 seconds?
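
A rough sketch of the polling and caching idea above, in Python purely for illustration (Supavisor itself is Elixir). Everything named here is hypothetical: `CachedCount`, `max_pool_size_per_node`, `headroom`, and the even-split formula are assumptions, not the agreed design. The SQL string only shows what a poller might run against pg_stat_activity; the sketch takes the fetch function as an argument so it runs without a database.

```python
import time
from typing import Callable, Optional

# Query a real poller might run; the application_name value would follow the
# "Supavisor Pool - #{pool_pid}" naming idea. Not executed in this sketch.
COUNT_SQL = "SELECT count(*) FROM pg_stat_activity WHERE application_name = %s"


class CachedCount:
    """Caches a connection count and refetches it once the TTL has elapsed."""

    def __init__(self, fetch: Callable[[], int], ttl_seconds: float = 5.0,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch          # hypothetical: runs COUNT_SQL and returns the count
        self._ttl = ttl_seconds      # assumption: the "maybe 5 seconds" expiry
        self._clock = clock          # injectable so the expiry can be tested
        self._value: Optional[int] = None
        self._expires_at = 0.0

    def get(self) -> int:
        now = self._clock()
        if self._value is None or now >= self._expires_at:
            self._value = self._fetch()
            self._expires_at = now + self._ttl
        return self._value


def max_pool_size_per_node(global_limit: int, peer_nodes: int) -> int:
    """Assumed formula: split a global limit evenly across peers plus this node."""
    node_count = peer_nodes + 1  # Node.list excludes the local node
    return max(1, global_limit // node_count)


def headroom(global_limit: int, observed_connections: int) -> int:
    """Connections still available cluster-wide according to the cached count."""
    return max(0, global_limit - observed_connections)
```

With a 5 second TTL each node queries the database at most once per interval, at the cost of the count being up to 5 seconds stale.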
Option 1
[ ] "primary" node with manager
[ ] need query rate per connection in manager
[ ] manager should be able to dynamically change pool size and calibrate pool size on each node
[ ] nodes with busier connections should have relatively more conns to the database
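
One way the manager in Option 1 could calibrate pool sizes, sketched in Python for illustration only. The function name `calibrate_pool_sizes`, the per-node minimum, and the largest-remainder rounding are all assumptions; the notes above only say that busier nodes should get relatively more connections.

```python
from typing import Dict


def calibrate_pool_sizes(rates: Dict[str, float], total: int,
                         minimum: int = 1) -> Dict[str, int]:
    """Split `total` connections across nodes in proportion to query rate.

    `rates` maps a node name to its observed query rate per connection
    (hypothetical metric reported to the manager). Every node keeps at
    least `minimum` connections; the rest is shared proportionally.
    """
    if not rates:
        return {}
    if total < minimum * len(rates):
        raise ValueError("total is too small to give every node the minimum")

    sizes = {node: minimum for node in rates}
    spare = total - minimum * len(rates)
    rate_sum = sum(rates.values())

    if rate_sum <= 0:
        # No traffic observed anywhere: fall back to an even split.
        shares = {node: spare / len(rates) for node in rates}
    else:
        shares = {node: spare * rate / rate_sum for node, rate in rates.items()}

    for node, share in shares.items():
        sizes[node] += int(share)

    # Rounding down can leave a few connections unassigned; hand them to the
    # nodes with the largest fractional share (ties broken by node name).
    leftover = total - sum(sizes.values())
    by_fraction = sorted(shares, key=lambda n: (-(shares[n] - int(shares[n])), n))
    for node in by_fraction[:leftover]:
        sizes[node] += 1
    return sizes
```

For example, `calibrate_pool_sizes({"a": 300, "b": 100}, total=10)` gives node `a` 7 connections and node `b` 3. The minimum keeps an idle node from being starved before its traffic picks up.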