hapostgres / pg_auto_failover

Postgres extension and service for automated failover and high-availability

Work around pg_replication_slot_advance xmin maintenance bug. #815

Closed DimCitus closed 3 years ago

DimCitus commented 3 years ago

When the Postgres function pg_replication_slot_advance is called on a slot that used to be active (using streaming replication) and is now inactive (maintained manually), the xmin of the slot is no longer maintained.

A stale xmin on such a slot can prevent Postgres from VACUUMing dead rows and otherwise maintaining itself correctly. To avoid that, when a former primary node becomes available again and joins a pg_auto_failover cluster as a standby, we drop the pre-existing replication slots and create new ones (where this time xmin is NULL).
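As a rough illustration of the drop-and-recreate idea described above (not the code in this PR, which lives in the pg_auto_failover C sources), the sketch below takes rows shaped like the `slot_name`, `active` and `xmin` columns of `pg_replication_slots` and builds the SQL needed to replace every inactive slot that still carries an xmin. The `Slot` type, the `recreate_statements` helper and the sample slot names are hypothetical. Restricting the replacement to inactive slots with a non-NULL xmin is an assumption of this sketch.

```python
from typing import List, NamedTuple, Optional


class Slot(NamedTuple):
    """One row of pg_replication_slots, reduced to the columns used here."""
    slot_name: str
    active: bool
    xmin: Optional[int]  # None stands for SQL NULL


def recreate_statements(slots: List[Slot]) -> List[str]:
    """Build SQL that drops and recreates inactive slots holding an xmin.

    Hypothetical helper: the selection rule is an assumption of this sketch,
    not necessarily what pg_auto_failover does.
    """
    statements = []
    for slot in slots:
        # Active slots cannot be dropped, and slots whose xmin is already
        # NULL do not hold back VACUUM, so both are left alone.
        if slot.active or slot.xmin is None:
            continue
        name = slot.slot_name.replace("'", "''")  # escape for a SQL literal
        statements.append(f"SELECT pg_drop_replication_slot('{name}');")
        # A freshly created physical slot starts with a NULL xmin.
        statements.append(
            f"SELECT pg_create_physical_replication_slot('{name}');"
        )
    return statements


if __name__ == "__main__":
    # Illustrative sample data only.
    sample = [
        Slot("pgautofailover_standby_1", active=False, xmin=1234),
        Slot("pgautofailover_standby_2", active=True, xmin=1240),
        Slot("pgautofailover_standby_3", active=False, xmin=None),
    ]
    for statement in recreate_statements(sample):
        print(statement)
```

With the sample data, only the first slot is replaced: the second is still active and the third already has a NULL xmin.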

Fixes #814

JelteF commented 3 years ago

For reference, this is the bug report to Postgres about this: https://www.postgresql.org/message-id/flat/VI1PR83MB01897A49AE5C54A82F841D8999B09%40VI1PR83MB0189.EURPRD83.prod.outlook.com