EnterpriseDB / barman

Barman - Backup and Recovery Manager for PostgreSQL
https://www.pgbarman.org/
GNU General Public License v3.0

Issue Replication slot error for standby #1032

Closed adnanhamdussalam closed 14 hours ago

adnanhamdussalam commented 2 weeks ago

Hi,

Barman Server: 10.114.16.34
Primary Server: 10.114.16.68
Standby Server: 10.114.16.70

[barman@testbed05 conf.d]$ barman -v
3.11.1 Barman by EnterpriseDB (www.enterprisedb.com)

I want to take backups from the standby instead of the primary. Please find my configuration below:

[barman@testbed05 conf.d]$ cat pg.conf
; Barman, Backup and Recovery Manager for PostgreSQL
; https://www.pgbarman.org/ - https://www.enterprisedb.com/
;
; Template configuration file for a server using
; only streaming replication protocol
;

[pg]
; Human readable description
description = "Example of PostgreSQL Database (Streaming-Only)"

; ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; PostgreSQL connection string (mandatory)
; ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
conninfo = host=10.114.16.70 user=barman dbname=mydb

ssh_command = ssh postgres@10.114.16.68 -q

backup_options = concurrent_backup

; ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; PostgreSQL streaming connection string
; ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; To be used by pg_basebackup for backup and pg_receivewal for WAL streaming
; NOTE: streaming_barman is a regular user with REPLICATION privilege
streaming_conninfo = host=10.114.16.68 user=streaming_barman

; ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; Backup settings (via pg_basebackup)
; ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
backup_method = postgres
streaming_backup_name = barman_streaming_backup

; ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; WAL streaming settings (via pg_receivewal)
; ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
streaming_archiver = on
slot_name = barman
create_slot = auto
streaming_archiver_name = barman_receive_wal
streaming_archiver_batch_size = 50

; Uncomment the following line if you are also using archive_command
; otherwise the "empty incoming directory" check will fail
;archiver = on

; PATH setting for this server
path_prefix = "/usr/pgsql-16/bin"
primary_conninfo = host=10.114.16.68 user=barman dbname=mydb

However, the barman check command reports a replication slot error, which suggests the slot must be on the standby server. Any idea?

Please find the error below:

[barman@testbed05 conf.d]$ barman check pg
Server pg:
    PostgreSQL: OK
    superuser or standard user with backup privileges: OK
    PostgreSQL streaming: OK
    wal_level: OK
    PostgreSQL server is standby: OK
    Primary server is not a standby: OK
    Primary and standby have same system ID: OK
    replication slot: FAILED (slot 'barman' not initialised: is 'receive-wal' running?)
    directories: OK
    retention policy settings: OK
    backup maximum age: OK (no last_backup_maximum_age provided)
    backup minimum size: OK (3.4 GiB)
    wal maximum age: OK (no last_wal_maximum_age provided)
    wal size: OK (0 B)
    compression settings: OK
    failed backups: OK (there are 0 failed backups)
    minimum redundancy requirements: OK (have 4 backups, expected at least 0)
    pg_basebackup: OK
    pg_basebackup compatible: OK
    pg_basebackup supports tablespaces mapping: OK
    systemid coherence: OK
    pg_receivexlog: OK
    pg_receivexlog compatible: OK
    receive-wal running: OK
    archiver errors: OK

martinmarques commented 2 weeks ago

streaming_conninfo is pointing to the primary, that's what Barman uses for backups.
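
As a minimal sketch of the mismatch pointed out here, the snippet below compares the host in conninfo with the host in streaming_conninfo, using the values from the pg.conf posted above. The conninfo_host helper is hypothetical (it is not part of Barman) and only handles simple space-separated key=value strings.

```shell
#!/bin/sh
# Hypothetical helper (not a Barman command): print the host= value
# of a space-separated key=value connection string.
conninfo_host() {
    for kv in $1; do
        case "$kv" in
            host=*) printf '%s\n' "${kv#host=}" ;;
        esac
    done
}

# Values copied from the pg.conf posted in this issue.
conninfo='host=10.114.16.70 user=barman dbname=mydb'
streaming_conninfo='host=10.114.16.68 user=streaming_barman'

# Report when the two connection strings target different nodes.
if [ "$(conninfo_host "$conninfo")" != "$(conninfo_host "$streaming_conninfo")" ]; then
    echo "conninfo targets $(conninfo_host "$conninfo") but streaming_conninfo targets $(conninfo_host "$streaming_conninfo")"
fi
```

With the posted values the two hosts differ: conninfo points at the standby while streaming_conninfo still points at the primary.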

adnanhamdussalam commented 2 weeks ago

Thank you for the update. I am new to Barman. It is working now, but I had to drop the replication slot barman from the primary and create it manually on the standby; after that, barman check pg shows all OK. Does this mean we have to create the replication slot manually whenever the primary becomes a standby? For Barman, should the replication slot reside on the standby side?

Please find the log below:

[barman@testbed05 conf.d]$ barman check pg
Server pg:
    PostgreSQL: OK
    superuser or standard user with backup privileges: OK
    PostgreSQL streaming: OK
    wal_level: OK
    PostgreSQL server is standby: OK
    Primary server is not a standby: OK
    Primary and standby have same system ID: OK
    replication slot: OK
    directories: OK
    retention policy settings: OK
    backup maximum age: OK (no last_backup_maximum_age provided)
    backup minimum size: OK (3.4 GiB)
    wal maximum age: OK (no last_wal_maximum_age provided)
    wal size: OK (0 B)
    compression settings: OK
    failed backups: OK (there are 0 failed backups)
    minimum redundancy requirements: OK (have 8 backups, expected at least 0)
    pg_basebackup: OK
    pg_basebackup compatible: OK
    pg_basebackup supports tablespaces mapping: OK
    systemid coherence: OK
    pg_receivexlog: OK
    pg_receivexlog compatible: OK
    receive-wal running: OK
    archiver errors: OK
[barman@testbed05 conf.d]$ barman backup pg --wait
Starting backup using postgres method for server pg in /backup/barman/pg/base/20241105T121450
Backup start at LSN: 2/D7003348
Starting backup copy via pg_basebackup for 20241105T121450
Copy done (time: 11 seconds)
Finalising the backup.
Backup size: 3.4 GiB
Backup end at LSN: 2/D7003348 (0000002400000002000000D7, 00003348)
Backup completed (start time: 2024-11-05 12:14:50.304951, elapsed time: 11 seconds)
Waiting for the WAL file 0000002400000002000000D7 from server 'pg'
Processing xlog segments from streaming for pg
    0000002400000002000000D7
[barman@testbed05 conf.d]$
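
The manual step described in this comment (dropping the slot on the old node and creating it where Barman now streams from) could be sketched roughly as follows. The move_slot function and its argument order are hypothetical; the sketch assumes the barman database user is allowed to drop replication slots and that the old slot is no longer active. With create_slot = auto, as in the posted config, barman cron may create the slot on its own, in which case the explicit --create-slot call reports that the slot already exists.

```shell
#!/bin/sh
# Sketch only: move the Barman replication slot to the node that
# streaming_conninfo now targets. Function name and arguments are illustrative.
#   $1 = Barman server name, $2 = slot name,
#   $3 = host that currently holds the slot, $4 = database to connect to
move_slot() {
    server="$1"; slot="$2"; old_host="$3"; db="$4"

    # Drop the slot on the node that should no longer hold it.
    # PostgreSQL refuses to drop a slot that is still in use.
    psql -h "$old_host" -U barman -d "$db" \
        -c "SELECT pg_drop_replication_slot('$slot');"

    # Ask Barman to create the slot through streaming_conninfo.
    barman receive-wal --create-slot "$server"

    # Restart WAL streaming and re-run the checks.
    barman cron
    barman check "$server"
}

# Example call with the names used in this issue:
#   move_slot pg barman 10.114.16.68 mydb
```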

martinmarques commented 2 weeks ago

> Thank you for the update. I am new to Barman. It is working now, but I had to drop the replication slot barman from the primary and create it manually on the standby; after that, barman check pg shows all OK. Does this mean we have to create the replication slot manually whenever the primary becomes a standby? For Barman, should the replication slot reside on the standby side?

No, but there's a bug that needs fixing which might be related to what you are seeing.

https://github.com/EnterpriseDB/barman/issues/1024

adnanhamdussalam commented 2 weeks ago

Thank you for the update. I now need to change the replication mode from async to sync. The change is not reflected at the database level, although at the OS level the Barman pg_receivewal process is running in synchronous mode. Is this a bug, or is my configuration wrong?

Please find the output below:

mydb=# select * from pg_stat_replication;
   pid   | usesysid |     usename      |  application_name  | client_addr  | client_hostname | client_port |         backend_start         | backend_xmin |   state   |  sent_lsn  | write_lsn  | flush_lsn  | replay_lsn |    write_lag    |    flush_lag    |   replay_lag    | sync_priority | sync_state |          reply_time
---------+----------+------------------+--------------------+--------------+-----------------+-------------+-------------------------------+--------------+-----------+------------+------------+------------+------------+-----------------+-----------------+-----------------+---------------+------------+-------------------------------
 1323211 |    33644 | streaming_barman | barman_receive_wal | 10.114.16.34 |                 |       60170 | 2024-11-06 07:33:02.200294-05 |              | streaming | 3/9C000148 | 3/9C000148 | 3/9C000148 |            | 00:00:00.001916 | 00:00:00.001916 | 00:14:57.917293 |             0 | async      | 2024-11-06 07:48:00.134036-05
(1 row)

mydb=# show synchronous_standby_names;
 synchronous_standby_names
---------------------------
 barman_receive_wal
(1 row)

[barman@testbed05 wals]$ ps -ef | grep barman_receive_wal
barman 1226060 1226053 0 07:33 ? 00:00:00 /usr/pgsql-16/bin/pg_receivewal --dbname=dbname=replication host=10.114.16.69 options=-cdatestyle=iso replication=true user=streaming_barman application_name=barman_receive_wal --verbose --no-loop --no-password --directory=/backup/barman/worker/streaming --slot=barman --synchronous

[barman@testbed05 wals]$ barman replication-status worker
Status of streaming clients for server 'worker':
  Current LSN on master: 3/9C000060
  Number of streaming clients: 1

  1. Async WAL streamer
     Application name: barman_receive_wal
     Sync stage      : 3/3 Remote write
     Communication   : TCP/IP
     IP Address      : 10.114.16.34 / Port: 60170 / Host: -
     User name       : streaming_barman
     Current state   : streaming (async)
     Replication slot: barman
     WAL sender PID  : 1323211
     Started at      : 2024-11-06 07:33:02.200294-05:00
     Sent LSN   : 3/9C000060 (diff: 0 B)
     Write LSN  : 3/9C000060 (diff: 0 B)
     Flush LSN  : 3/9C000060 (diff: 0 B)
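
To narrow down the async/sync question, the relevant columns can be read in one go from the node that pg_receivewal actually connects to (the ps output above shows host=10.114.16.69). The sketch below is only a diagnostic aid; the show_sync_state function is hypothetical. It also prints pg_is_in_recovery() because PostgreSQL does not support synchronous cascading replication: if the queried node is itself a standby, its WAL senders report sync_priority 0 and sync_state async regardless of synchronous_standby_names. Whether that is the case here is an assumption that the thread does not confirm.

```shell
#!/bin/sh
# Sketch only: show the node role, the configured synchronous standby names,
# and the sync columns for one streaming client. Function name is illustrative.
#   $1 = host to query, $2 = database, $3 = application_name of the client
show_sync_state() {
    host="$1"; db="$2"; app="$3"
    psql -h "$host" -d "$db" -At -F ' | ' \
        -c "SELECT pg_is_in_recovery();" \
        -c "SHOW synchronous_standby_names;" \
        -c "SELECT application_name, sync_priority, sync_state FROM pg_stat_replication WHERE application_name = '$app';"
}

# Example call with the names used in this issue:
#   show_sync_state 10.114.16.69 mydb barman_receive_wal
```

If synchronous_standby_names was edited in postgresql.conf, the new value only takes effect after a configuration reload (for example SELECT pg_reload_conf();).
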
martinmarques commented 14 hours ago

@adnanhamdussalam I'm closing this ticket as the original issue got solved. If you have more questions, open a new one, or even better, use the Barman google groups where there are other experts that can help you.

https://groups.google.com/g/pgbarman