EnterpriseDB / barman

Barman - Backup and Recovery Manager for PostgreSQL
https://www.pgbarman.org/
GNU General Public License v3.0

archiver errors: FAILED (duplicates: 8) #897

Open franxav06 opened 7 months ago

franxav06 commented 7 months ago

Hello, I have a problem with duplicated WAL files for a replicated database on the Barman backup server:

Server xxxxxxxxxx02b:
    PostgreSQL: OK
    superuser or standard user with backup privileges: OK
    PostgreSQL streaming: OK
    wal_level: OK
    replication slot: OK
    directories: OK
    retention policy settings: OK
    backup maximum age: OK (no last_backup_maximum_age provided)
    backup minimum size: OK (1.9 GiB)
    wal maximum age: OK (no last_wal_maximum_age provided)
    wal size: OK (0 B)
    compression settings: OK
    failed backups: OK (there are 0 failed backups)
    minimum redundancy requirements: OK (have 7 backups, expected at least 7)
    ssh: OK (PostgreSQL server)
    systemid coherence: OK
    pg_receivexlog: OK
    pg_receivexlog compatible: OK
    receive-wal running: OK
    archive_mode: OK
    archive_command: OK
    continuous archiving: OK
    archiver errors: FAILED (duplicates: 8)

Below are the relevant entries from the log:

2024-02-07 03:15:23,531 [911882] barman.server INFO: Received file '00000001000000030000006D' with checksum '925b30ed0a852651fe5d418cd4df84e6' by put-wal for server 'XXXXXXXXX02b' (SSH host: XX.XXX.XXX.XX)
2024-02-07 03:15:26,216 [911869] barman.copy_controller INFO: Copy step 3 of 5: [bucket 0] finished (duration: 22 seconds) copy safe files from remote PGDATA directory: /var/lib/pgsql/15/data/
2024-02-07 03:15:26,243 [911869] barman.copy_controller INFO: Copy step 4 of 5: [bucket 0] starting copy files with checksum from remote PGDATA directory: /var/lib/pgsql/15/data/
2024-02-07 03:15:26,569 [911869] barman.copy_controller INFO: Copy step 4 of 5: [bucket 0] finished (duration: less than one second) copy files with checksum from remote PGDATA directory: /var/lib/pgsql/15/data/
2024-02-07 03:15:26,570 [911869] barman.copy_controller INFO: Copy step 5 of 5: [global] starting copy remote pg_control file: /var/lib/pgsql/15/data/global/pg_control
2024-02-07 03:15:26,722 [911869] barman.copy_controller INFO: Copy step 5 of 5: [global] finished (duration: less than one second) copy remote pg_control file: /var/lib/pgsql/15/data/global/pg_control
2024-02-07 03:15:26,723 [911849] barman.copy_controller INFO: Copy finished (safe before 2024-02-06 03:15:03.221710+01:00)
2024-02-07 03:15:26,743 [911849] barman.backup_executor INFO: Copy done (time: 22 seconds)
2024-02-07 03:15:26,745 [911849] barman.backup_executor INFO: Asking PostgreSQL server to finalize the backup.
2024-02-07 03:15:29,252 [911849] barman.backup INFO: Backup size: 1.9 GiB. Actual size on disk: 1.6 GiB (-14.40% deduplication ratio).
2024-02-07 03:15:29,253 [911849] barman.backup INFO: Backup end at LSN: 3/6D0000D8 (00000001000000030000006D, 000000D8)
2024-02-07 03:15:29,253 [911849] barman.backup INFO: Backup completed (start time: 2024-02-07 03:15:02.725770, elapsed time: 26 seconds)
2024-02-07 03:15:29,253 [911849] barman.server INFO: Waiting for the WAL file 00000001000000030000006D from server 'lavdcpgsql02b'
2024-02-07 03:15:29,255 [911849] barman.wal_archiver INFO: Found 2 xlog segments from streaming for lavdcpgsql02b. Archive all segments in one run.
2024-02-07 03:15:29,255 [911849] barman.wal_archiver INFO: Archiving segment 1 of 2 from streaming: lavdcpgsql02b/00000001000000030000006C
2024-02-07 03:15:29,931 [911849] barman.wal_archiver INFO: Archiving segment 2 of 2 from streaming: lavdcpgsql02b/00000001000000030000006D
2024-02-07 03:15:30,054 [911849] barman.wal_archiver INFO: Found 2 xlog segments from file archival for lavdcpgsql02b. Archive all segments in one run.
2024-02-07 03:15:30,054 [911849] barman.wal_archiver INFO: Archiving segment 1 of 2 from file archival: lavdcpgsql02b/00000001000000030000006C
2024-02-07 03:15:30,183 [911849] barman.wal_archiver INFO:      Error: 00000001000000030000006C is already present in server lavdcpgsql02b. File moved to errors directory.
2024-02-07 03:15:30,183 [911849] barman.wal_archiver INFO: Archiving segment 2 of 2 from file archival: lavdcpgsql02b/00000001000000030000006D
2024-02-07 03:15:30,274 [911849] barman.wal_archiver INFO:      Error: 00000001000000030000006D is already present in server lavdcpgsql02b. File moved to errors directory.
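In the entries above, segments `00000001000000030000006C` and `00000001000000030000006D` are each archived once from streaming and then rejected when they arrive again from file archival. A minimal sketch of how such a log could be scanned to list which segments arrived from more than one source is given below. The regular expressions are assumptions derived only from the message wording quoted in this issue, and `summarize_wal_log` is a hypothetical helper name, not part of Barman.

```python
import re
from collections import defaultdict

# Patterns inferred from the log lines quoted in this issue (an assumption,
# not a documented Barman log format).
ARCHIVING = re.compile(
    r"Archiving segment \d+ of \d+ from (streaming|file archival): "
    r"\S+/([0-9A-F]{24})"
)
DUPLICATE = re.compile(r"Error: ([0-9A-F]{24}) is already present")


def summarize_wal_log(lines):
    """Return (sources per segment, segments flagged as duplicates)."""
    sources = defaultdict(set)
    duplicates = []
    for line in lines:
        match = ARCHIVING.search(line)
        if match:
            source, segment = match.groups()
            sources[segment].add(source)
            continue
        match = DUPLICATE.search(line)
        if match:
            duplicates.append(match.group(1))
    return dict(sources), duplicates


# Message parts copied from the log above (timestamps and PIDs omitted).
SAMPLE = [
    "Archiving segment 1 of 2 from streaming: lavdcpgsql02b/00000001000000030000006C",
    "Archiving segment 2 of 2 from streaming: lavdcpgsql02b/00000001000000030000006D",
    "Archiving segment 1 of 2 from file archival: lavdcpgsql02b/00000001000000030000006C",
    "     Error: 00000001000000030000006C is already present in server lavdcpgsql02b. File moved to errors directory.",
    "Archiving segment 2 of 2 from file archival: lavdcpgsql02b/00000001000000030000006D",
    "     Error: 00000001000000030000006D is already present in server lavdcpgsql02b. File moved to errors directory.",
]

sources, duplicates = summarize_wal_log(SAMPLE)
# Segments delivered by more than one channel.
multi_source = sorted(seg for seg, src in sources.items() if len(src) > 1)
print(multi_source, duplicates)
```

Run against the full log file, this would show whether every duplicate reported by the archiver corresponds to a segment delivered by both channels.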

And here is the WAL configuration in postgresql.conf:

`#------------------------------------------------------------------------------
# WRITE-AHEAD LOG
#------------------------------------------------------------------------------

# - Settings -

wal_level = replica                 # minimal, replica, or logical
                                    # (change requires restart)
fsync = on                          # flush data to disk for crash safety
                                    # (turning this off can cause
                                    # unrecoverable data corruption)
synchronous_commit = on             # synchronization level;
                                    # off, local, remote_write, remote_apply, or on
wal_sync_method = fsync             # the default is the first option
                                    # supported by the operating system:
                                    #   open_datasync
                                    #   fdatasync (default on Linux and FreeBSD)
                                    #   fsync
                                    #   fsync_writethrough
                                    #   open_sync
full_page_writes = on               # recover from partial page writes
wal_log_hints = on                  # also do full page writes of non-critical updates
                                    # (change requires restart)
wal_compression = off               # enables compression of full-page writes;
                                    # off, pglz, lz4, zstd, or on
wal_init_zero = on                  # zero-fill new WAL files
wal_recycle = on                    # recycle WAL files
wal_buffers = -1                    # min 32kB, -1 sets based on shared_buffers
                                    # (change requires restart)
wal_writer_delay = 200ms            # 1-10000 milliseconds
wal_writer_flush_after = 1MB        # measured in pages, 0 disables
wal_skip_threshold = 2MB

commit_delay = 0                    # range 0-100000, in microseconds
commit_siblings = 5                 # range 1-1000

# - Checkpoints -

checkpoint_timeout = 5min           # range 30s-1d
checkpoint_completion_target = 0.9  # checkpoint target duration, 0.0 - 1.0
checkpoint_flush_after = 256kB      # measured in pages, 0 disables
checkpoint_warning = 30s            # 0 disables
max_wal_size = 1GB
min_wal_size = 80MB

# - Prefetching during recovery -

recovery_prefetch = try             # prefetch pages referenced in the WAL?
wal_decode_buffer_size = 512kB      # lookahead window used for prefetching
                                    # (change requires restart)

# - Archiving -

archive_mode = on                   # enables archiving; off, on, or always
                                    # (change requires restart)
archive_library = ''                # library to use to archive a logfile segment
archive_command = 'barman-wal-archive xxxxxbarman01a xxxxxpgsql02b %p'  # command to use to archive a logfile segment
                                    # placeholders: %p = path of file to archive
                                    #               %f = file name only
`

Thanks in advance for your help

Francesco