Closed joe-at-startupmedia closed 4 months ago
I'm currently using scenario A but with consul instead of etcd to eliminate the need for keepalived & virtual IPs.
SQL write -> primary-haproxy.service.consul (dns load balanced) -> haproxy:5000 -> patroni_master_pgbouncer node -> local patroni master
SQL read -> replica-haproxy.service.consul (dns load balanced) -> haproxy:5001 -> patroni_replica_pgbouncer node (roundrobin) -> local patroni replica
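The routing chains above correspond to an haproxy configuration roughly like the following sketch. Hostnames and ports are illustrative, and it assumes patroni's REST API on 8008 (the `/primary` and `/replica` health endpoints) and pgbouncer listening on 6432:

```
listen primary
    bind *:5000
    option httpchk GET /primary
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server node1 node1:6432 check port 8008
    server node2 node2:6432 check port 8008
    server node3 node3:6432 check port 8008

listen replicas
    bind *:5001
    balance roundrobin
    option httpchk GET /replica
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server node1 node1:6432 check port 8008
    server node2 node2:6432 check port 8008
    server node3 node3:6432 check port 8008
```

Note the checks probe patroni (8008), not pgbouncer (6432) — which is exactly why a pgbouncer failure goes unnoticed here.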
Haproxy fails: consul DNS will omit the failing node's A record from the response, resulting in read/write requests only being routed to nodes where haproxy is healthy.
Patroni fails: patroni will automatically fail over and elect a new leader. This in turn dynamically reloads the haproxy configuration so that write requests are only routed to the newly elected primary node. Pooled connections on the former primary node will be terminated, since the connection between pgbouncer and patroni has to be re-established.
Pgbouncer fails: Currently this doesn't seem to be handled. Patroni failover must be triggered if the node is the primary.
When pgbouncer goes down, haproxy still responds to and/or routes write requests to the primary node reported by patroni. The problem is that since the pgbouncer service can be down on the same node shared with the patroni master, there will be no way to perform the write.
Why might the pgbouncer service be unavailable? I have not had such a case in 5 years of cluster operation.
As a result, the node must handle this by forcing patroni to failover if pgbouncer is down. How is that handled? The alternative would be to use an existing pgbouncer on another node to handle the write remotely but this complicates configuration considerably.
While for haproxy ("Type A" scheme) high availability is provided by the keepalived service, for the pgbouncer service high availability is not provided by anything (in the "Type A" and "Type B" schemes); we just rely on its reliability...
I can recommend deploying multiple pgbouncer processes (using the so_reuseport option) to minimize the risk. See the pgbouncer_processes variable.
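For context, a minimal sketch of the relevant pgbouncer.ini fragment (so_reuseport requires pgbouncer 1.12+; the values here are illustrative):

```ini
; pgbouncer.ini — relevant fragment only
[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
; With so_reuseport enabled, several pgbouncer processes can bind the
; same port and the kernel distributes incoming connections across them,
; so one crashed process does not take the port down.
so_reuseport = 1
```

Each of the N processes (per the pgbouncer_processes variable) is then started as its own service instance against the same listen address and port.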
Checks of the pgbouncer service are currently only implemented in the scheme using Consul ("Type C" scheme), see the details of the checks here
I agree that the probability of the pgbouncer service going down is statistically insignificant. I've also had a patroni cluster running for years at a time without any issues with the pgbouncer service. That said, I'm an unlucky person, and accounting for every point of failure helps me sleep better at night. While deploying multiple pgbouncer processes could also remedy this, that wouldn't help in the event of a configuration/firewall issue where all pgbouncer services go down. In either case, what's your opinion on the following solution? It's likely beyond the scope of this project, but it resolves the issue in my testing. https://github.com/joe-at-startupmedia/haproxy-agent-checker-patroni-pgbouncer
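The idea behind an agent-check approach can be sketched as follows. This is not the linked project's actual code, just an illustration: haproxy's agent-check periodically connects to an agent port on each server and acts on the one-line reply, so a tiny responder can report local pgbouncer health. The pgbouncer address and agent port below are assumptions.

```python
import socket
import socketserver

PGBOUNCER_ADDR = ("127.0.0.1", 6432)  # assumed local pgbouncer port

def pgbouncer_state(addr=PGBOUNCER_ADDR, timeout=2.0):
    """Return 'up' if a TCP connection to pgbouncer succeeds, else 'down'."""
    try:
        with socket.create_connection(addr, timeout=timeout):
            return "up"
    except OSError:
        return "down"

class AgentHandler(socketserver.StreamRequestHandler):
    """Reply with one line per haproxy agent-check probe; a 'down' reply
    marks the server down so haproxy skips the node until pgbouncer recovers."""
    def handle(self):
        self.wfile.write((pgbouncer_state() + "\n").encode())

# To run: socketserver.TCPServer(("0.0.0.0", 9999), AgentHandler).serve_forever()
# haproxy side: server node1 node1:6432 check agent-check agent-port 9999
```

Marking the node down in haproxy is only the detection half; if the node is the current leader, something still has to trigger a patroni failover (e.g. through patroni's REST API), which is presumably the part the linked project handles.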
In scenarios where you have pgbouncer between haproxy and patroni, the problem described above applies: when pgbouncer goes down on the node shared with the patroni master, haproxy still routes write requests to that node and the write cannot be performed, so the node must force a patroni failover (or the write must be delegated to a pgbouncer on another node, which complicates configuration considerably).
If haproxy goes down it's different, because either the virtual IP or consul routes the request to a node where haproxy is alive to complete the local pgbouncer -> patroni request chain.
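The consul side of this can be sketched as a service definition with a TCP health check on the haproxy port, so a dead haproxy drops the node's A record from DNS responses. Service name and ports here are assumptions matching the DNS names above:

```json
{
  "service": {
    "name": "primary-haproxy",
    "port": 5000,
    "check": {
      "name": "haproxy tcp check",
      "tcp": "localhost:5000",
      "interval": "5s",
      "timeout": "2s"
    }
  }
}
```

While the check fails, consul DNS stops returning that node for primary-haproxy.service.consul, which is what keeps read/write traffic on nodes with a healthy haproxy.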