vitabaks / postgresql_cluster

PostgreSQL High-Availability Cluster (based on "Patroni" and DCS "etcd" or "consul"). Automating with Ansible.
MIT License
1.27k stars 340 forks source link

Patroni: Add permanent replication slots #614

Closed vitabaks closed 1 month ago

vitabaks commented 1 month ago

Add the ability to define permanent replication slots.

Example:

patroni_slots:
  - slot: "<logical_replication_slot_name>" # the name of the permanent replication slot.
    type: "logical" # the type of slot. Could be physical or logical (if the slot is logical, you have to define 'database' and 'plugin').
    plugin: "pgoutput" # the plugin name for the logical slot.
    database: "<my_db_name>" # the database name where logical slots should be created.

These slots will be preserved during switchover/failover. Permanent slots that don’t exist will be created by Patroni.

How it works:

With PostgreSQL 11 onwards permanent physical slots are created on all nodes and their position is advanced every loop_wait seconds. For PostgreSQL versions older than 11 permanent physical replication slots are maintained only on the current primary. The logical slots are copied from the primary to a standby with restart, and after that their position advanced every loop_wait seconds (if necessary). Copying logical slot files performed via libpq connection and using either rewind or superuser credentials.

There is always a chance that the logical slot position on the replica is a bit behind the former primary, therefore application should be prepared that some messages could be received the second time after the failover.

Permanent replication slots are synchronized only from the primary/standby_leader to replica nodes. That means, applications are supposed to be using them only from the leader node. Using them on replica nodes will cause indefinite growth of pg_wal on all other nodes in the cluster.

Doc: https://patroni.readthedocs.io/en/latest/dynamic_configuration.html

Additionally

vitabaks commented 1 month ago

Test

Vars:

patroni_slots:
  - slot: "logical_replication_slot"
    type: "logical"
    plugin: "pgoutput"
    database: "postgres"

Result:

--check slot on Primary
postgres=# select * from pg_replication_slots where slot_type = 'logical'\gx
-[ RECORD 1 ]-------+-------------------------
slot_name           | logical_replication_slot
plugin              | pgoutput
slot_type           | logical
datoid              | 5
database            | postgres
temporary           | f
active              | f
active_pid          | 
xmin                | 
catalog_xmin        | 733
restart_lsn         | 0/172B808
confirmed_flush_lsn | 0/172B840
wal_status          | reserved
safe_wal_size       | 
two_phase           | f
conflicting         | f

--check slot on Replica
postgres=# select * from pg_replication_slots where slot_type = 'logical'\gx
-[ RECORD 1 ]-------+-------------------------
slot_name           | logical_replication_slot
plugin              | pgoutput
slot_type           | logical
datoid              | 5
database            | postgres
temporary           | f
active              | f
active_pid          | 
xmin                | 
catalog_xmin        | 733
restart_lsn         | 0/172B808
confirmed_flush_lsn | 0/172B840
wal_status          | reserved
safe_wal_size       | 
two_phase           | f
conflicting         | f

patronictl show-config

loop_wait: 10
master_start_timeout: 300
maximum_lag_on_failover: 1048576
postgresql:
  parameters:
    ...
  use_pg_rewind: true
  use_slots: true
retry_timeout: 10
slots:
  logical_replication_slot:
    database: postgres
    plugin: pgoutput
    type: logical
synchronous_mode: false
synchronous_mode_strict: false
synchronous_node_count: 1
ttl: 30

passed