pgEdge / spock

Logical Multi-master Replication
https://github.com/pgedge/pgedge
Other
163 stars 14 forks

Seamless traffic diversion? #13

Closed nkev closed 10 months ago

nkev commented 11 months ago

In your guide Installing Distributed PostgreSQL (Nov 16, 2023), the section High Availability and Failover near the bottom, the second sentence says:

If one or more nodes become unavailable, traffic seamlessly shifts to the remaining active nodes, eliminating any delays or lost transactions associated with passive nodes deliberating to elect a new active node.

Spock is a Postgres extension, which means if Postgres goes down, pgEdge goes down. How is the traffic "seamlessly shifted to healthy nodes"?

My application talks directly to Postgres and as far as I know, there is no interception by pgEdge that could divert the traffic so how can this claim be true?

luss commented 10 months ago

This is an excellent question. I'm gonna have Ibrar address and point you to some of his detailed writings on this topic.

ibrarahmad commented 10 months ago

In the pgEdge Ultra-HA solution, each active node within a region is equipped with two read replicas. For instance, if there are two active nodes, one in each region, each of them has its own two read replicas. Streaming replication connects these read replicas to the primary (write/active) node. If one of the write/active nodes fails, a read replica seamlessly transitions to become the new active (primary) node, so transaction processing continues uninterrupted. Within the region, the setup uses Patroni for failover management, etcd for quorum-based selection of the active node, and HAProxy for seamless traffic diversion to the current primary.

For more detailed insights into how this works and the technologies employed, you can refer to the blog.
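To make the routing step concrete, here is a minimal Python sketch of the kind of health check a proxy such as HAProxy can run against Patroni. It relies on Patroni's REST API, where `GET /primary` returns HTTP 200 only on the node that currently holds the leader lock, and on Patroni's default REST port 8008. The host names (`pg-a1` and so on), the helper names `patroni_says_primary` and `pick_primary`, and the simulated probe are illustrative assumptions; this is not pgEdge's actual configuration.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable, Optional

PATRONI_PORT = 8008  # Patroni's default REST API port; an assumption about the deployment


def patroni_says_primary(host: str, timeout: float = 2.0) -> bool:
    """Return True if Patroni on `host` answers GET /primary with HTTP 200."""
    url = f"http://{host}:{PATRONI_PORT}/primary"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Replicas answer with a non-200 status (raised as HTTPError, a URLError
        # subclass); unreachable nodes raise URLError or OSError.
        # Either way the node is not treated as the primary.
        return False


def pick_primary(
    hosts: Iterable[str],
    probe: Callable[[str], bool] = patroni_says_primary,
) -> Optional[str]:
    """Return the first host whose probe reports it as primary, else None."""
    for host in hosts:
        if probe(host):
            return host
    return None


if __name__ == "__main__":
    # Simulated cluster state instead of real network calls (hypothetical hosts).
    is_primary = {"pg-a1": True, "pg-a2": False, "pg-a3": False}
    print(pick_primary(is_primary, probe=lambda h: is_primary[h]))

    # pg-a1 fails and Patroni promotes pg-a2; the next check routes there.
    is_primary.update({"pg-a1": False, "pg-a2": True})
    print(pick_primary(is_primary, probe=lambda h: is_primary[h]))
```

The point of the sketch is that the application keeps connecting to one proxy address, while the proxy repeats a check like this and forwards traffic to whichever node Patroni currently reports as primary.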

nkev commented 10 months ago

Thanks for the clarification. The link I posted does not mention the use of third-party products like Patroni or HAProxy.

If one or more nodes become unavailable, traffic seamlessly shifts to the remaining active nodes, eliminating any delays or lost transactions associated with passive nodes deliberating to elect a new active node.

The wording makes it sound as if it's all done internally. It might be a good idea to mention that this solution is only available in pgEdge Cloud, and that pgEdge Platform users would need to set up their own Patroni/HAProxy solution.

luss commented 10 months ago

We certainly are helping and supporting Platform users with an integrated solution that includes our customized version of Patroni.