ossc-db / pg_rman

Backup and restore management tool for PostgreSQL
http://ossc-db.github.io/pg_rman/index.html

Issue when trying to delete arch's from ARCLOG_PATH #265

Open TeodorChakalov opened 8 months ago

TeodorChakalov commented 8 months ago

Hello,

OpenSSL version: OpenSSL 3.0.7 1 Nov 2022 (Library: OpenSSL 3.0.7 1 Nov 2022)

OS version: NAME="Rocky Linux" VERSION="9.2 (Blue Onyx)"

Postgresql version: v14.4

I set the ARCLOG_PATH to /opt/pdbelsng/backups/arch:

pdbelsng_adm@kde0210-bos-t01.0210.de.kaufland fullbackup$ cat pg_rman.ini
ARCLOG_PATH = /opt/pdbelsng/backups/arch
SRVLOG_PATH = /opt/pdbelsng/data/pg14/pg_log

BACKUP_MODE = F
COMPRESS_DATA = YES
KEEP_ARCLOG_FILES = 10
KEEP_ARCLOG_DAYS = -1

So when I go to the ARCLOG_PATH directory:

cd /opt/pdbelsng/backups/arch
pdbelsng_adm@kde0210-bos-t01.0210.de.kaufland arch$ ls -la
total 2156
drwx------. 2 pdbelsng_adm pdbelsng_adm 4096 Mar 21 09:42 .
drwx------. 4 pdbelsng_adm pdbelsng_adm 4096 Mar 21 09:17 ..
-rw-------. 1 pdbelsng_adm pdbelsng_adm 1815846 Mar 21 09:33 000000010000000000000001.gz
-rw-------. 1 pdbelsng_adm pdbelsng_adm 200 Mar 21 09:33 000000010000000000000002.00000060.backup.gz
-rw-------. 1 pdbelsng_adm pdbelsng_adm 16476 Mar 21 09:33 000000010000000000000002.gz
-rw-------. 1 pdbelsng_adm pdbelsng_adm 18420 Mar 21 09:33 000000010000000000000003.gz
-rw-------. 1 pdbelsng_adm pdbelsng_adm 16387 Mar 21 09:33 000000010000000000000004.partial.gz
-rw-------. 1 pdbelsng_adm pdbelsng_adm 16463 Mar 21 09:33 000000020000000000000004.gz
-rw-------. 1 pdbelsng_adm pdbelsng_adm 200 Mar 21 09:33 000000020000000000000005.00000060.backup.gz
-rw-------. 1 pdbelsng_adm pdbelsng_adm 16478 Mar 21 09:33 000000020000000000000005.gz
-rw-------. 1 pdbelsng_adm pdbelsng_adm 16935 Mar 21 09:33 000000020000000000000006.gz
-rw-------. 1 pdbelsng_adm pdbelsng_adm 16420 Mar 21 09:33 000000020000000000000007.gz
-rw-------. 1 pdbelsng_adm pdbelsng_adm 16389 Mar 21 09:33 000000020000000000000008.partial.gz
-rw-------. 1 pdbelsng_adm pdbelsng_adm 59 Mar 21 09:33 00000002.history.gz
-rw-------. 1 pdbelsng_adm pdbelsng_adm 16464 Mar 21 09:33 000000030000000000000008.gz
-rw-------. 1 pdbelsng_adm pdbelsng_adm 200 Mar 21 09:33 000000030000000000000009.00000060.backup.gz
-rw-------. 1 pdbelsng_adm pdbelsng_adm 16475 Mar 21 09:33 000000030000000000000009.gz
-rw-------. 1 pdbelsng_adm pdbelsng_adm 17225 Mar 21 09:33 00000003000000000000000A.gz
-rw-------. 1 pdbelsng_adm pdbelsng_adm 16419 Mar 21 09:33 00000003000000000000000B.gz
-rw-------. 1 pdbelsng_adm pdbelsng_adm 16389 Mar 21 09:36 00000003000000000000000C.partial.gz
-rw-------. 1 pdbelsng_adm pdbelsng_adm 67 Mar 21 09:33 00000003.history.gz
-rw-------. 1 pdbelsng_adm pdbelsng_adm 16461 Mar 21 09:39 00000004000000000000000C.gz
-rw-------. 1 pdbelsng_adm pdbelsng_adm 200 Mar 21 09:39 00000004000000000000000D.00000028.backup.gz
-rw-------. 1 pdbelsng_adm pdbelsng_adm 16486 Mar 21 09:39 00000004000000000000000D.gz
-rw-------. 1 pdbelsng_adm pdbelsng_adm 16418 Mar 21 09:42 00000004000000000000000E.gz
-rw-------. 1 pdbelsng_adm pdbelsng_adm 200 Mar 21 09:42 00000004000000000000000F.00000028.backup.gz
-rw-------. 1 pdbelsng_adm pdbelsng_adm 16465 Mar 21 09:42 00000004000000000000000F.gz
-rw-------. 1 pdbelsng_adm pdbelsng_adm 73 Mar 21 09:36 00000004.history.gz

When starting the backup I see these INFO messages:

INFO: start deleting old archived WAL files from ARCLOG_PATH (keep files = 10, keep days = -1)
INFO: the threshold timestamp calculated by keep days is "2024-03-22 00:00:00"

Why are the files not deleted?

Maybe I missed something?

Thank you in advance and have a great day ahead!

TeodorChakalov commented 7 months ago

Hello, I am still waiting for your response.

Thank you.

zwyan0 commented 7 months ago

@TeodorChakalov Hi, I'm sorry for the delayed response. Could you tell me which package you are using?

TeodorChakalov commented 7 months ago

Hello,

branch REL_14_STABLE

Thank you in advance.

TeodorChakalov commented 7 months ago

@zwyan0 Do you have any update on this topic?

zwyan0 commented 7 months ago

@TeodorChakalov Hi, I could not test in the same environment, but I checked with pg15. It looks like the .gz files were not deleted.

[postgres@rocky9 archive_wal]$ cat $BACKUP_PATH/pg_rman.ini
ARCLOG_PATH='/var/lib/pgsql/archive_wal'
SRVLOG_PATH='/var/lib/pgsql/15/data/log'

BACKUP_MODE = F
COMPRESS_DATA = YES
KEEP_ARCLOG_FILES = 10
KEEP_ARCLOG_DAYS = -1

[postgres@rocky9 archive_wal]$ pg_rman backup --backup-mode=full --with-serverlog --progress
INFO: copying database files
Processed 1012 of 1012 files, skipped 0
INFO: copying archived WAL files
Processed 36 of 36 files, skipped 11
INFO: copying server log files
Processed 6 of 6 files, skipped 3
INFO: backup complete
INFO: Please execute 'pg_rman validate' to verify the files are correctly copied.
INFO: start deleting old archived WAL files from ARCLOG_PATH (keep files = 10, keep days = -1) 
INFO: the threshold timestamp calculated by keep days is "2024-04-16 00:00:00"
INFO: delete "000000010000000000000016"
INFO: delete "000000010000000000000015"
INFO: delete "000000010000000000000014"
INFO: delete "000000010000000000000013"
INFO: delete "000000010000000000000012"
INFO: delete "000000010000000000000011"
INFO: delete "000000010000000000000010"
INFO: delete "000000010000000000000010.00000028.backup"
INFO: delete "00000001000000000000000F"
INFO: delete "00000001000000000000000E"
INFO: delete "00000001000000000000000D"
INFO: delete "00000001000000000000000C"
INFO: delete "00000001000000000000000B"
INFO: delete "00000001000000000000000A"
INFO: delete "000000010000000000000009"
INFO: delete "000000010000000000000008"
INFO: delete "000000010000000000000007"

[postgres@rocky9 archive_wal]$ ls
000000010000000000000017  000000010000000000000021.gz
000000010000000000000018  000000010000000000000022.gz
000000010000000000000019  000000010000000000000023.gz
00000001000000000000001A  000000010000000000000024.gz
00000001000000000000001B  000000010000000000000025.gz
00000001000000000000001C  000000010000000000000026.gz
00000001000000000000001D  000000010000000000000027.gz
00000001000000000000001E  000000010000000000000028.00000028.backup.gz
00000001000000000000001F  000000010000000000000028.gz
000000010000000000000020
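
The comparison between the `INFO: delete` lines and the remaining directory listing can be made mechanical. The following is a minimal Python sketch, assuming only the log line format shown in this thread; `deleted_names` and `summarize` are hypothetical helpers written for illustration, not part of pg_rman.

```python
import re

# Matches log lines of the form: INFO: delete "<file name>"
# (format taken from the pg_rman output quoted in this thread).
DELETE_LINE = re.compile(r'^INFO: delete "([^"]+)"')


def deleted_names(log_text: str) -> set[str]:
    """Collect the file names reported as deleted in the log text."""
    names = set()
    for line in log_text.splitlines():
        match = DELETE_LINE.match(line.strip())
        if match:
            names.add(match.group(1))
    return names


def summarize(log_text: str, remaining: list[str]) -> dict[str, int]:
    """Count deleted and remaining files, split by a .gz suffix."""
    deleted = deleted_names(log_text)
    return {
        "deleted_plain": sum(not n.endswith(".gz") for n in deleted),
        "deleted_gz": sum(n.endswith(".gz") for n in deleted),
        "remaining_plain": sum(not n.endswith(".gz") for n in remaining),
        "remaining_gz": sum(n.endswith(".gz") for n in remaining),
    }


if __name__ == "__main__":
    # A short excerpt of the log and listing above, for demonstration only.
    log = (
        'INFO: delete "000000010000000000000016"\n'
        'INFO: delete "000000010000000000000010.00000028.backup"\n'
    )
    listing = ["000000010000000000000017", "000000010000000000000021.gz"]
    print(summarize(log, listing))
```

Feeding it the full log and `ls` output makes it easy to see whether any `.gz` file ever appears among the deleted names.
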

TeodorChakalov commented 7 months ago

@zwyan0 What do you mean by this statement: "It looks like the .gz files were not deleted."?

Can .gz files not be deleted with these two parameters?

KEEP_ARCLOG_FILES = 10
KEEP_ARCLOG_DAYS = -1

zwyan0 commented 7 months ago

@TeodorChakalov

I mean that it looks like a bug. Changing the setting (keep files = 5 in the run below) again removed only the uncompressed WAL files. During backup, after getting the list of archived WAL files, pg_rman should delete files from that list according to their timestamps. Or perhaps pg_rman simply does not support deleting archived WAL files that were compressed with gzip. I'm not sure, so I will report this issue to the team.
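
To make the distinction between "normal" and compressed archive files explicit, here is a small Python sketch that labels file names by type. The patterns are inferred only from the names that appear in this thread, and `classify` is a hypothetical helper; it is not pg_rman code and makes no claim about how pg_rman actually selects files for deletion.

```python
import re

# Name patterns inferred from the listings in this thread (an assumption):
# a WAL segment is 24 hex digits; a backup-history file appends
# ".<8 hex digits>.backup" to a segment name.
WAL_SEGMENT = re.compile(r"^[0-9A-F]{24}$")
BACKUP_HISTORY = re.compile(r"^[0-9A-F]{24}\.[0-9A-F]{8}\.backup$")


def classify(name: str) -> str:
    """Label one archive file name; a '+gz' suffix marks a .gz file."""
    compressed = name.endswith(".gz")
    base = name[:-3] if compressed else name
    if WAL_SEGMENT.match(base):
        kind = "wal"
    elif BACKUP_HISTORY.match(base):
        kind = "backup-history"
    else:
        kind = "other"  # e.g. .history or .partial files
    return f"{kind}+gz" if compressed else kind


if __name__ == "__main__":
    # Names copied from the logs and listings in this thread.
    for name in [
        "000000010000000000000016",
        "000000010000000000000010.00000028.backup",
        "00000001000000000000002A.gz",
        "000000010000000000000028.00000028.backup.gz",
    ]:
        print(name, "->", classify(name))
```

Every name that appears in an `INFO: delete` line in this thread falls into the plain `wal` or `backup-history` class, which matches the observation that only uncompressed files are removed.
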

[postgres@rocky9 archive_wal]$ pg_rman backup --backup-mode=full --with-serverlog --progress --verbose
========================================
backup start
----------------------------------------
# configuration
BACKUP_MODE=FULL
FULL_BACKUP_ON_ERROR=false
WITH_SERVERLOG=true
COMPRESS_DATA=true
----------------------------------------
INFO: copying database files
(1/1012) PG_VERSION compressed 11 (366.67% of 3)
...
(1011/1012) postmaster.opts compressed 35 (129.63% of 27)
(1012/1012) postmaster.pid compressed 93 (90.29% of 103)
database backup completed(read: 53543868 write: 8592777)
========================================
========================================
INFO: copying archived WAL files
(1/34) 000000010000000000000017 skip
(2/34) 000000010000000000000018 skip
(3/34) 000000010000000000000019 skip
(4/34) 00000001000000000000001A skip
(5/34) 00000001000000000000001B skip
(6/34) 00000001000000000000001C skip
(7/34) 00000001000000000000001D skip
(8/34) 00000001000000000000001E skip
(9/34) 00000001000000000000001F skip
(10/34) 000000010000000000000020 skip
(11/34) 000000010000000000000021.gz skip
(12/34) 000000010000000000000022.gz skip
(13/34) 000000010000000000000023.gz skip
(14/34) 000000010000000000000024.gz skip
(15/34) 000000010000000000000025.gz skip
(16/34) 000000010000000000000026.gz skip
(17/34) 000000010000000000000027.gz skip
(18/34) 000000010000000000000028.00000028.backup.gz skip
(19/34) 000000010000000000000028.gz skip
(20/34) 000000010000000000000029.gz skip
(21/34) 00000001000000000000002A.00000028.backup.gz skip
(22/34) 00000001000000000000002A.gz skip
(23/34) 00000001000000000000002B.gz skip
(24/34) 00000001000000000000002C.00000028.backup.gz skip
(25/34) 00000001000000000000002C.gz skip
(26/34) 00000001000000000000002D.gz skip
(27/34) 00000001000000000000002E.00000028.backup.gz skip
(28/34) 00000001000000000000002E.gz skip
(29/34) 00000001000000000000002F.gz skip
(30/34) 000000010000000000000030.00000028.backup.gz skip
(31/34) 000000010000000000000030.gz skip
(32/34) 000000010000000000000031.gz compressed 264 (1.61% of 16426)
(33/34) 000000010000000000000032.00000028.backup.gz compressed 209 (105.56% of 198)
(34/34) 000000010000000000000032.gz compressed 297 (1.80% of 16460)
archived WAL backup completed(read: 33084 write: 770)
========================================
========================================
INFO: copying server log files
(1/7) postgresql-2024-04-12_155856.log skip
(2/7) postgresql-2024-04-13_000000.log skip
(3/7) postgresql-2024-04-15_153240.log skip
(4/7) postgresql-2024-04-15_154302.log skip
(5/7) postgresql-2024-04-15_155037.log skip
(6/7) postgresql-2024-04-15_155710.log skip
(7/7) postgresql-2024-04-16_000000.log copied 5039
serverlog backup completed(read: 5039 write: 5039)
========================================
all backup completed(read: 53581991 write: 8598586)
========================================
INFO: backup complete
INFO: Please execute 'pg_rman validate' to verify the files are correctly copied.
INFO: start deleting old archived WAL files from ARCLOG_PATH (keep files = 5, keep days = -1)
INFO: the threshold timestamp calculated by keep days is "2024-04-17 00:00:00"
INFO: delete "00000001000000000000001B"  ★ normal WAL file was deleted
INFO: delete "00000001000000000000001A"
INFO: delete "000000010000000000000019"
INFO: delete "000000010000000000000018"
INFO: delete "000000010000000000000017"
========================================
delete online WAL backup
========================================
delete symbolic link in archive directory
[postgres@rocky9 archive_wal]$ pg_rman validate
INFO: validate: "2024-04-16 09:27:03" backup, archive log files and server log files by CRC
INFO: backup "2024-04-16 09:27:03" is valid
[postgres@rocky9 archive_wal]$ pg_rman validate
[postgres@rocky9 archive_wal]$ ls
00000001000000000000001C                     00000001000000000000002A.00000028.backup.gz 
00000001000000000000001D                     00000001000000000000002A.gz ★ still not deleted
00000001000000000000001E                     00000001000000000000002B.gz
00000001000000000000001F                     00000001000000000000002C.00000028.backup.gz
000000010000000000000020                     00000001000000000000002C.gz
000000010000000000000021.gz                  00000001000000000000002D.gz
000000010000000000000022.gz                  00000001000000000000002E.00000028.backup.gz
000000010000000000000023.gz                  00000001000000000000002E.gz
000000010000000000000024.gz                  00000001000000000000002F.gz
000000010000000000000025.gz                  000000010000000000000030.00000028.backup.gz
000000010000000000000026.gz                  000000010000000000000030.gz
000000010000000000000027.gz                  000000010000000000000031.gz
000000010000000000000028.00000028.backup.gz  000000010000000000000032.00000028.backup.gz
000000010000000000000028.gz                  000000010000000000000032.gz

TeodorChakalov commented 5 months ago

@zwyan0 Did you receive any feedback from the team?

Thank you in advance.