hapostgres / pg_auto_failover

Postgres extension and service for automated failover and high-availability

Failover for Citus doesn't support pg_dist_poolinfo (connection pooling) #973

Open mattklein opened 1 year ago


We're testing the newly added support in pg_auto_failover for a Citus formation. In our Citus formation, we use pg_dist_poolinfo alongside pg_dist_node so that the internal connections between Citus nodes are made through a connection pooler (pgbouncer). pg_auto_failover does not appear to support pg_dist_poolinfo, though: in our testing, when pg_auto_failover triggers a failover, it makes the appropriate change to pg_dist_node but leaves pg_dist_poolinfo untouched.
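For context, the pooler entries are ordinary rows in the pg_dist_poolinfo catalog, keyed by the same nodeid as pg_dist_node; we populate them with plain INSERTs on the coordinator. A sketch (hostnames and port are from our staging setup):

```sql
-- Route internal Citus connections for node 2 through pgbouncer on port 6432.
-- When a pg_dist_poolinfo row exists for a nodeid, Citus uses its poolinfo
-- connection string instead of the nodename/nodeport from pg_dist_node.
INSERT INTO pg_dist_poolinfo (nodeid, poolinfo)
VALUES (2, 'host=citus-02-a.stg port=6432 dbname=ourdb');
```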

Before failover:

ourdb=# select * from pg_dist_node order by nodeid;
 nodeid | groupid |     nodename      | nodeport | noderack | hasmetadata | isactive | noderole | nodecluster | metadatasynced | shouldhaveshards
--------+---------+-------------------+----------+----------+-------------+----------+----------+-------------+----------------+------------------
      1 |       1 | citus-01-a.stg    |     5432 | default  | t           | t        | primary  | default     | t              | t
      2 |       2 | citus-02-a.stg    |     5432 | default  | t           | t        | primary  | default     | t              | t
      3 |       0 | citus-coord-a.stg |     5432 | default  | t           | t        | primary  | default     | t              | f
(3 rows)

ourdb=# select * from pg_dist_poolinfo order by nodeid;
 nodeid |                    poolinfo
--------+-------------------------------------------------
      1 | host=citus-01-a.stg port=6432 dbname=ourdb
      2 | host=citus-02-a.stg port=6432 dbname=ourdb
      3 | host=citus-coord-a.stg port=6432 dbname=ourdb
(3 rows)

After failover of citus-02-a to citus-02-b:

ourdb=# select * from pg_dist_node order by nodeid;
 nodeid | groupid |     nodename      | nodeport | noderack | hasmetadata | isactive | noderole | nodecluster | metadatasynced | shouldhaveshards
--------+---------+-------------------+----------+----------+-------------+----------+----------+-------------+----------------+------------------
      1 |       1 | citus-01-a.stg    |     5432 | default  | t           | t        | primary  | default     | t              | t
      2 |       2 | citus-02-b.stg    |     5432 | default  | t           | t        | primary  | default     | t              | t
      3 |       0 | citus-coord-a.stg |     5432 | default  | t           | t        | primary  | default     | t              | f
(3 rows)

ourdb=# select * from pg_dist_poolinfo order by nodeid;
 nodeid |                    poolinfo
--------+-------------------------------------------------
      1 | host=citus-01-a.stg port=6432 dbname=ourdb
      2 | host=citus-02-a.stg port=6432 dbname=ourdb
      3 | host=citus-coord-a.stg port=6432 dbname=ourdb
(3 rows)

So, after failover, citus-02-b has taken the place of citus-02-a in pg_dist_node, but the corresponding row in pg_dist_poolinfo still points at citus-02-a. Because Citus prefers the pg_dist_poolinfo entry over nodename/nodeport when one exists, internal connections for group 2 keep going to the failed node, which renders the cluster inoperable.
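Until pg_auto_failover handles this, our manual workaround is to fix the stale row by hand on the coordinator after each failover. A sketch, assuming the new primary's pooler listens on the same port:

```sql
-- Repoint the pooler entry for nodeid 2 at the new primary, citus-02-b.
UPDATE pg_dist_poolinfo
   SET poolinfo = 'host=citus-02-b.stg port=6432 dbname=ourdb'
 WHERE nodeid = 2;
```

This is obviously racy relative to the failover itself, which is why we'd like pg_auto_failover to update pg_dist_poolinfo in the same step as pg_dist_node.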