vitabaks / postgresql_cluster

PostgreSQL High-Availability Cluster (based on Patroni). Automating with Ansible.
https://postgresql-cluster.org
MIT License
1.81k stars 414 forks source link

Suggestion: Haproxy "Local Replica" pool #260

Closed rez0n closed 1 year ago

rez0n commented 1 year ago

Hi, I have cluster running on 3 nodes in different datacenters, application instances also running there, latency between them is not high (5ms/8ms) but it is not 0.xx ms, so huge queries from server2 to server1 running slower than locally. I started using :5000 to reach replica for read-only requests, but as it balance scheme is round-robin it is randomly routes you to local replica and remote ones.

To solve this, I added one more HAProxy section to prioritize local replica (self) rather random replica, meanwhile if it will down remotes will be used as backup. Here is my config as feature suggestion, unfortunately I have no time to setup test environment to play with pull-request to submit it. Server1: 10.10.10.1 Server2: 10.10.10.2 Server3: 10.10.10.3

# /etc/confd/templates/haproxy.tmpl
listen replica_local
    bind 10.10.10.1:5004
    maxconn 10000
    option tcplog
    option httpchk OPTIONS /read-only
    balance roundrobin
    http-check expect status 200
    default-server inter 3s fastinter 1s fall 3 rise 2 on-marked-down shutdown-sessions
 server pgnode1 10.10.10.1:6432 check port 8008
 server pgnode2 10.10.10.2:6432 check port 8008 backup
 server pgnode3 10.10.10.3:6432 check port 8008 backup

And same for each other node, without 'backup' in server pgnode... line with node ip. Anyway this post may help someone to solve same case.

vitabaks commented 1 year ago

Thanks for the comment. I must say that this configuration is not suitable for everyone because in most cases it is necessary to balance the load across all replicas. But this does not mean that you cannot change the configuration of Haproxy to suit your needs, perhaps your example will be useful to someone else.

For a cluster located in different data centers, I would recommend using the "Type C" scheme.

Besides, it can be useful for a distributed cluster across different data centers. We can specify in advance which data center the database server is located in and then use this for applications running in the same data center. Example: replica.postgres-cluster.service.dc1.consul, replica.postgres-cluster.service.dc2.consul