EnterpriseDB / barman

Barman - Backup and Recovery Manager for PostgreSQL
https://www.pgbarman.org/
GNU General Public License v3.0

rsync error during backup or recover #883

Closed dblugeon closed 8 months ago

dblugeon commented 8 months ago

Hello enterpriseDB Team,

On a fresh server (Rocky Linux 9.3), I installed Barman 3.9 (from the PGDG repo). When I back up a PostgreSQL 13 server (also on Rocky Linux 9.3), I get these warnings and errors:

2024-01-10 15:40:30,584 [24439] barman.copy_controller INFO: Detected rsync version less than 3.1. top using '--ignore-missing-args' argument.
2024-01-10 15:40:30,592 [24439] RsyncPgData WARNING: Overflow in read_varint()
2024-01-10 15:40:30,592 [24439] RsyncPgData WARNING: rsync error: error in rsync protocol data stream (code 12) at io.c(1754) [sender=3.2.3]
2024-01-10 15:40:30,592 [24439] RsyncPgData WARNING: rsync error: received SIGUSR1 (code 19) at main.c(1599) [Receiver=3.2.3]
2024-01-10 15:40:30,692 [24439] barman.copy_controller ERROR: Unable to retrieve reference directory file list. Using only source file information to decide which files need to be copied with checksums enabled: {'ret': 12, 'out': '', 'err': 'Overflow in read_varint()\nrsync error: error in rsync protocol data stream (code 12) at io.c(1754) [sender=3.2.3]\nrsync error: received SIGUSR1 (code 19) at main.c(1599) [Receiver=3.2.3]\n'}
2024-01-10 15:40:30,918 [24439] barman.copy_controller INFO: Copy step 2 of 13: [global] create destination directories and delete unknown files for remote tablespace directory 'db_toto_data': /home/san1/pg_tbs/toto/toto_data/
2024-01-10 15:40:31,143 [24439] barman.copy_controller INFO: Copy step 3 of 13: [global] analyze remote tablespace directory 'db_toto_index': /home/san1/pg_tbs/toto/toto_index/
2024-01-10 15:40:31,149 [24439] RsyncPgData WARNING: Overflow in read_varint()
2024-01-10 15:40:31,149 [24439] RsyncPgData WARNING: rsync error: error in rsync protocol data stream (code 12) at io.c(1754) [sender=3.2.3]
2024-01-10 15:40:31,149 [24439] RsyncPgData WARNING: rsync error: received SIGUSR1 (code 19) at main.c(1599) [Receiver=3.2.3]
2024-01-10 15:40:31,250 [24439] barman.copy_controller ERROR: Unable to retrieve reference directory file list. Using only source file information to decide which files need to be copied with checksums enabled: {'ret': 12, 'out': '', 'err': 'Overflow in read_varint()\nrsync error: error in rsync protocol data stream (code 12) at io.c(1754) [sender=3.2.3]\nrsync error: received SIGUSR1 (code 19) at main.c(1599) [Receiver=3.2.3]\n'}

I don't understand 3 points:

  1. Why does Barman detect rsync < 3.1 when Rocky Linux ships rsync 3.2.3?
  2. What do the warnings about the rsync error mean?
  3. What does the Barman error about the reference directory file list mean, and what is its impact on backup or recovery?

I also checked that I can back up and recover without error (exit code 0). Despite these errors, I was able to restore a physical backup and run pg_dump then pg_restore without problems.

OS version: Rocky Linux 9.3
Barman version: 3.9.0-1PGDG.rhel9
rsync version (from Rocky): 3.2.3
Python version: 3.9.18

Regards

gcalacoci commented 8 months ago

Hi, could you run barman diagnose and post the output here, after removing any sensitive information from it?

About your question number 3: is this the first backup?

dblugeon commented 8 months ago

Hello @gcalacoci, thank you for your response. Here is the diagnose result from a Barman server on Rocky 9.3: diagnose_rocky8.json

Regarding "About your question number 3. is this the first backup?": it was the first backup, but I get the same log lines with the following backups.

I ran another test: I installed Barman 3.9 on Rocky Linux 8, backing up a PostgreSQL instance that also runs on Rocky Linux 8, and I don't get any warnings or errors about rsync. I uploaded the diagnose result for this test: diagnose_rocky9.json

Regards

dblugeon commented 8 months ago

Hello, I ran more tests and, in fact, I can't reproduce the problem. I have narrowed it down: only my first PostgreSQL VM is affected. I have scheduled another test for next week; depending on its result, I will close this issue.

Regards

dblugeon commented 8 months ago

Hello @gcalacoci, I found the problem. It wasn't a problem with Barman but with rsync.

Rocky/RHEL/xxx 9 ships rsync 3.2.3, and this version is affected by WayneD/rsync#84.
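
To check quickly whether a host runs the rsync version named above, the first line of `rsync --version` can be parsed. The following is only a minimal sketch: the function names are hypothetical, the regular expression assumes the usual "rsync  version X.Y.Z  protocol version N" banner, and treating exactly 3.2.3 as affected is an assumption based on this thread, not on the upstream fix history.

```python
import re
from typing import Optional, Tuple

# Assumed banner shape: "rsync  version 3.2.3  protocol version 31".
# An optional leading "v" before the number is tolerated.
_VERSION_RE = re.compile(r"rsync\s+version\s+v?(\d+)\.(\d+)(?:\.(\d+))?")


def parse_rsync_version(banner: str) -> Optional[Tuple[int, int, int]]:
    """Extract (major, minor, patch) from `rsync --version` output, or None."""
    match = _VERSION_RE.search(banner)
    if match is None:
        return None
    major, minor, patch = match.groups()
    # A missing patch component is treated as 0.
    return int(major), int(minor), int(patch or 0)


def is_affected(banner: str) -> bool:
    """Assumption from this thread: only rsync 3.2.3 is flagged."""
    return parse_rsync_version(banner) == (3, 2, 3)
```

The banner would be obtained by running `rsync --version` on each host (the Barman server and the PostgreSQL server) and passing the output to `is_affected`.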

You can get these warnings if:

To fix these warnings:

I checked the debug log during a backup. Barman uses these options in its first attempt:

rsync .... -e "ssh 'postgres@PG_VM.acme' '-o' 'BatchMode=yes' '"... --ignore-missing-args --itemize-changes --itemize-changes --no-human-readable --list-only -r /path/PG_VM/base/20240115T103614/16404/

This fails with rsync 3.2.3, and Barman falls back to:

rsync .... -e "ssh 'postgres@PG_VM.acme' '-o' 'BatchMode=yes' '"... --itemize-changes --itemize-changes --no-human-readable --list-only -r /path/PG_VM/base/20240115T103614/16404/

Then, in the last attempt:

rsync ... --itemize-changes --itemize-changes --no-human-readable --list-only -r :/path/PG_VM/base/20240115T103614/16404/

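So these warnings are not a real problem, thanks to Barman's fall-back. This probably also explains the "Detected rsync version less than 3.1" message: it is logged when the attempt with --ignore-missing-args fails, whatever rsync version is actually installed.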
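
The fall-back pattern seen in the debug log can be sketched roughly as follows. This is not Barman's actual code: `list_files_with_fallback`, `subprocess_runner` and the `Runner` type are hypothetical names, the ssh transport options are left out, and only the first two attempts from the log are modelled.

```python
import subprocess
from typing import Callable, Sequence, Tuple

# Hypothetical runner type: takes an argv list, returns (exit code, stdout, stderr).
Runner = Callable[[Sequence[str]], Tuple[int, str, str]]

# Listing options copied from the debug log above.
BASE_ARGS = [
    "--itemize-changes",
    "--itemize-changes",
    "--no-human-readable",
    "--list-only",
    "-r",
]


def subprocess_runner(argv: Sequence[str]) -> Tuple[int, str, str]:
    """Run the command for real and capture its output as text."""
    proc = subprocess.run(list(argv), capture_output=True, text=True, check=False)
    return proc.returncode, proc.stdout, proc.stderr


def list_files_with_fallback(run: Runner, path: str) -> Tuple[str, bool]:
    """Return (listing, used_ignore_missing_args).

    First try with --ignore-missing-args; if rsync exits non-zero,
    retry once without that option.
    """
    code, out, err = run(["rsync", "--ignore-missing-args", *BASE_ARGS, path])
    if code == 0:
        return out, True
    # The first attempt failed (for example with code 12): drop the option.
    code, out, err = run(["rsync", *BASE_ARGS, path])
    if code != 0:
        raise RuntimeError(f"rsync failed with code {code}: {err.strip()}")
    return out, False
```

Passing the runner as a parameter keeps the sketch testable without rsync installed; in real use, `list_files_with_fallback(subprocess_runner, "/some/dir/")` would invoke the local rsync binary.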

Thank you for your time and sorry for the disturbance.