Closed joe-at-startupmedia closed 4 months ago
I'm currently using scenario A but with consul instead of etcd to eliminate the need for keepalived & virtual IPs.
SQL write -> primary-haproxy.service.consul (dns load balanced) -> haproxy:5000 -> patroni_master_pgbouncer node -> local patroni master
SQL read -> replica-haproxy.service.consul (dns load balanced) -> haproxy:5001 -> patroni_replica_pgbouncer node (roundrobin) -> local patroni replica
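The routing chains above correspond to an haproxy configuration roughly like the following sketch. Hostnames and ports are illustrative, and it assumes patroni's REST API on 8008 (the `/primary` and `/replica` health endpoints) and pgbouncer listening on 6432:

```
listen primary
    bind *:5000
    option httpchk GET /primary
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server node1 node1:6432 check port 8008
    server node2 node2:6432 check port 8008
    server node3 node3:6432 check port 8008

listen replicas
    bind *:5001
    balance roundrobin
    option httpchk GET /replica
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server node1 node1:6432 check port 8008
    server node2 node2:6432 check port 8008
    server node3 node3:6432 check port 8008
```

Note the checks probe patroni (8008), not pgbouncer (6432) — which is exactly why a pgbouncer failure goes unnoticed here.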
Haproxy fails: consul DNS will omit the failing node's A record from the response, resulting in read/write requests only being routed to nodes where haproxy is healthy.
Patroni fails: patroni will automatically fail over and elect a new leader. This in turn dynamically reloads the haproxy configuration so that write requests are only routed to the newly elected primary node. Pooled connections on the former primary node will be terminated, since the connection between pgbouncer and patroni has to be re-established.
Pgbouncer fails: Currently this doesn't seem to be handled. Patroni failover must be triggered if the node is the primary.
When pgbouncer goes down, haproxy still responds to and/or routes write requests to the primary node reported by patroni. The problem is that since the pgbouncer service can be down on the same node shared with the patroni master, there will be no way to perform the write.
Why might the pgbouncer service be unavailable? I have not had such a case in 5 years of cluster operation.
As a result, the node must handle this by forcing patroni to failover if pgbouncer is down. How is that handled? The alternative would be to use an existing pgbouncer on another node to handle the write remotely but this complicates configuration considerably.
While for haproxy ("Type A" scheme) high availability is provided by the keepalived service, for the pgbouncer service high availability is not provided by anything (in the "Type A" and "Type B" schemes); we just rely on its reliability...
I can recommend deploying multiple pgbouncer processes (using the so_reuseport option) to minimize the risk. See the pgbouncer_processes variable.
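For context, a minimal sketch of the relevant pgbouncer.ini fragment (so_reuseport requires pgbouncer 1.12+; the values here are illustrative):

```ini
; pgbouncer.ini — relevant fragment only
[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
; With so_reuseport enabled, several pgbouncer processes can bind the
; same port and the kernel distributes incoming connections across them,
; so one crashed process does not take the port down.
so_reuseport = 1
```

Each of the N processes (per the pgbouncer_processes variable) is then started as its own service instance against the same listen address and port.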
Checks of the pgbouncer service are currently only implemented in the scheme using Consul ("Type C" scheme), see the details of the checks here
I agree that the probability of the pgbouncer service going down is statistically insignificant. I've also had a patroni cluster running for years at a time without any issues with the pgbouncer service. That said, I'm an unlucky person, and accounting for every point of failure helps me sleep better at night. While deploying multiple pgbouncer processes could also remedy this, that wouldn't help in the event of a configuration/firewall issue where all pgbouncer services go down. In either case, what's your opinion on the following solution? It's likely beyond the scope of this project, but it resolves the issue in my testing. https://github.com/joe-at-startupmedia/haproxy-agent-checker-patroni-pgbouncer
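The idea behind an agent-check approach can be sketched as follows. This is not the linked project's actual code, just an illustration: haproxy's agent-check periodically connects to an agent port on each server and acts on the one-line reply, so a tiny responder can report local pgbouncer health. The pgbouncer address and agent port below are assumptions.

```python
import socket
import socketserver

PGBOUNCER_ADDR = ("127.0.0.1", 6432)  # assumed local pgbouncer port

def pgbouncer_state(addr=PGBOUNCER_ADDR, timeout=2.0):
    """Return 'up' if a TCP connection to pgbouncer succeeds, else 'down'."""
    try:
        with socket.create_connection(addr, timeout=timeout):
            return "up"
    except OSError:
        return "down"

class AgentHandler(socketserver.StreamRequestHandler):
    """Reply with one line per haproxy agent-check probe; a 'down' reply
    marks the server down so haproxy skips the node until pgbouncer recovers."""
    def handle(self):
        self.wfile.write((pgbouncer_state() + "\n").encode())

# To run: socketserver.TCPServer(("0.0.0.0", 9999), AgentHandler).serve_forever()
# haproxy side: server node1 node1:6432 check agent-check agent-port 9999
```

Marking the node down in haproxy is only the detection half; if the node is the current leader, something still has to trigger a patroni failover (e.g. through patroni's REST API), which is presumably the part the linked project handles.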
In scenarios where you have pgbouncer between haproxy and patroni, the problem described above applies: when pgbouncer goes down on the node shared with the patroni master, haproxy still routes write requests to that node and the write cannot be performed, so the node must force a patroni failover (or the write must be delegated to a pgbouncer on another node, which complicates configuration considerably).
If haproxy goes down it's different, because either the virtual IP or consul routes the request to a node where haproxy is alive to complete the local pgbouncer -> patroni request chain.
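The consul side of this can be sketched as a service definition with a TCP health check on the haproxy port, so a dead haproxy drops the node's A record from DNS responses. Service name and ports here are assumptions matching the DNS names above:

```json
{
  "service": {
    "name": "primary-haproxy",
    "port": 5000,
    "check": {
      "name": "haproxy tcp check",
      "tcp": "localhost:5000",
      "interval": "5s",
      "timeout": "2s"
    }
  }
}
```

While the check fails, consul DNS stops returning that node for primary-haproxy.service.consul, which is what keeps read/write traffic on nodes with a healthy haproxy.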