This PR implements the functionality necessary for a single IP-based HA Postgres instance. It accomplishes this through HAProxy and a VRRP VIP and some health-check scripts. Essentially, it does as follows (excerpt from @jhunt)
On bootstrap, if there is no data directory, the postgres job will
revert to a normal, index-based setup. The first node will assume
the role of the master, and the second will become a replica.
Once the data directory has been populated, future restarts of the
postgres job will attempt to contact the other node to see if it
is a master. If the other node responds, and reports itself as a
master, the local node will attempt a `pg_basebackup` from the
master and assume the role of a replica.
If the other node doesn't respond, or reports itself as a replica,
the local node will keep trying, for up to
`postgres.replication.grace` seconds, at which point it will
assume the mantle of leadership and become the master node,
using its current data directory as the canonical truth.
Each node then starts up a `monitor` process; this process is
responsible for ultimately promoting a local replica to be a
master, in the event that the real master goes offline. It works
like this:
1. Busy-loop (via 1-second sleeps) until the local postgres
instance is available on its configured port. This prevents
monitor from trying to restart the postgres while it is
running a replica `pg_basebackup`.
2. Busy-loop (again via 1-second sleeps) for as long as the
local postgres is a master.
3. Busy-loop (again via 1-second sleeps), checking the master
status of the other postgres node, until it detects that
either the master node has gone away (via a connection
timeout), or the master node has somehow become a replica.
4. Promote the local postgres node to a master.
This has been tested by hand in vSphere with CF's ccdb, uaa, and diegodb using postgres as it's storage medium. We noticed a x < 5 second outage during master failover. Release notes are already written.
Split-brain is almost impossible to prevent without a quorum solution, or some sort of binary star pattern. We are mostly trying to protect against outages during a deployment.
This PR implements the functionality necessary for a single IP-based HA Postgres instance. It accomplishes this through HAProxy and a VRRP VIP and some health-check scripts. Essentially, it does as follows (excerpt from @jhunt)
This has been tested by hand in vSphere with CF's ccdb, uaa, and diegodb using postgres as it's storage medium. We noticed a
x < 5
second outage during master failover. Release notes are already written.