pgstef / check_pgbackrest

pgBackRest backup check plugin for Nagios
PostgreSQL License

Memory leak? #20

Closed jaymefSO closed 3 years ago

jaymefSO commented 3 years ago

I observed an issue the other day. I am running check_pgbackrest 2.0 and Perl 5.26.3 via NRPE on my CentOS 8.3 backup server. I have two stanzas, caracara and goutou; the pgbackrest info output is included below.

The backup server ran out of memory a few days ago, and after thorough investigation I was able to trace it back to check_pgbackrest. Because the NRPE systemd service has a -500 OOM score adjustment by default, scripts launched under NRPE inherit that adjustment, so other, more important processes get killed by the OOM killer first.
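
For what it's worth, here is a rough sketch of a systemd drop-in that would stop NRPE-launched plugins from inheriting that adjustment (untested; it assumes the unit is named nrpe.service, as on CentOS/EPEL):

```bash
# Untested sketch: give NRPE (and the plugins it launches) a neutral OOM score
# adjustment instead of the -500 it ships with. Assumes the unit is nrpe.service.
sudo mkdir -p /etc/systemd/system/nrpe.service.d
sudo tee /etc/systemd/system/nrpe.service.d/oom.conf <<'EOF'
[Service]
OOMScoreAdjust=0
EOF
sudo systemctl daemon-reload
sudo systemctl restart nrpe.service
```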

I noticed in dmesg at the time of the memory event that there were several running check_pgbackrest processes consuming large amounts of memory, so I decided to test manually.

Running check_pgbackrest -s archives -S $stanza on the first stanza, caracara, works fine and completes in a matter of seconds. However, running the same check on my other stanza, goutou, hangs for a very long time and consumes a lot of resources (I'm not sure if it ever completes). This has always worked before, so I am trying to understand whether the issue is a bug/memory leak in check_pgbackrest or whether there is some problem with my archives that is making check_pgbackrest behave this way.

I used --debug and it gets hung up here:

DEBUG: Get all the needed WAL archives...

I attempted to strace the process and I'm just getting a loop of brk calls like this:

strace /usr/lib64/nagios/plugins/check_pgbackrest -s archives -S goutou

brk(0x555ebe74a000)                     = 0x555ebe74a000
brk(NULL)                               = 0x555ebe74a000
brk(0x555ebe76b000)                     = 0x555ebe76b000
mremap(0x7f8f624dd000, 4366336, 5242880, MREMAP_MAYMOVE) = 0x7f8f624dd000
brk(NULL)                               = 0x555ebe76b000
brk(0x555ebe78c000)                     = 0x555ebe78c000
brk(NULL)                               = 0x555ebe78c000
brk(0x555ebe7ad000)                     = 0x555ebe7ad000
brk(NULL)                               = 0x555ebe7ad000
brk(0x555ebe7ce000)                     = 0x555ebe7ce000
brk(NULL)                               = 0x555ebe7ce000
brk(0x555ebe7ef000)                     = 0x555ebe7ef000
brk(NULL)                               = 0x555ebe7ef000
brk(0x555ebe810000)                     = 0x555ebe810000
brk(NULL)                               = 0x555ebe810000
brk(0x555ebe831000)                     = 0x555ebe831000
brk(NULL)                               = 0x555ebe831000
brk(0x555ebe852000)                     = 0x555ebe852000
brk(NULL)                               = 0x555ebe852000
brk(0x555ebe873000)                     = 0x555ebe873000
brk(NULL)                               = 0x555ebe873000
brk(0x555ebe894000)                     = 0x555ebe894000
brk(NULL)                               = 0x555ebe894000
brk(0x555ebe8b5000)                     = 0x555ebe8b5000
brk(NULL)                               = 0x555ebe8b5000
brk(0x555ebe8d6000)                     = 0x555ebe8d6000
 pgbackrest info
stanza: caracara
    status: ok
    cipher: none

    db (current)
        wal archive min/max (11): 00000001000003EC000000D8/00000001000003F3000000EA

        full backup: 20210509-064502F
            timestamp start/stop: 2021-05-09 06:45:02 / 2021-05-09 07:00:30
            wal start/stop: 00000001000003EC000000D8 / 00000001000003EC000000D8
            database size: 36GB, database backup size: 36GB
            repo1: backup set size: 7.4GB, backup size: 7.4GB

        full backup: 20210516-064502F
            timestamp start/stop: 2021-05-16 06:45:02 / 2021-05-16 07:00:41
            wal start/stop: 00000001000003EE00000076 / 00000001000003EE00000076
            database size: 36.2GB, database backup size: 36.2GB
            repo1: backup set size: 7.4GB, backup size: 7.4GB

        full backup: 20210523-064502F
            timestamp start/stop: 2021-05-23 06:45:02 / 2021-05-23 07:00:36
            wal start/stop: 00000001000003F000000005 / 00000001000003F000000005
            database size: 36GB, database backup size: 36GB
            repo1: backup set size: 7.3GB, backup size: 7.3GB

        full backup: 20210530-064503F
            timestamp start/stop: 2021-05-30 06:45:03 / 2021-05-30 07:00:42
            wal start/stop: 00000001000003F1000000CE / 00000001000003F1000000CE
            database size: 36.1GB, database backup size: 36.1GB
            repo1: backup set size: 7.3GB, backup size: 7.3GB

        full backup: 20210606-064503F
            timestamp start/stop: 2021-06-06 06:45:03 / 2021-06-06 07:00:44
            wal start/stop: 00000001000003F300000065 / 00000001000003F300000065
            database size: 36.2GB, database backup size: 36.2GB
            repo1: backup set size: 7.4GB, backup size: 7.4GB

        diff backup: 20210606-064503F_20210607-064503D
            timestamp start/stop: 2021-06-07 06:45:03 / 2021-06-07 06:52:21
            wal start/stop: 00000001000003F3000000A1 / 00000001000003F3000000A1
            database size: 36.3GB, database backup size: 16.2GB
            repo1: backup set size: 7.4GB, backup size: 3GB
            backup reference list: 20210606-064503F

        diff backup: 20210606-064503F_20210608-064503D
            timestamp start/stop: 2021-06-08 06:45:03 / 2021-06-08 06:52:47
            wal start/stop: 00000001000003F3000000D9 / 00000001000003F3000000D9
            database size: 36.3GB, database backup size: 17.3GB
            repo1: backup set size: 7.4GB, backup size: 3.2GB
            backup reference list: 20210606-064503F

stanza: goutou
    status: ok
    cipher: none

    db (current)
        wal archive min/max (11): 000000010000042D00000031/000000020000042C00000038

        full backup: 20210514-111800F
            timestamp start/stop: 2021-05-14 11:18:00 / 2021-05-14 11:40:08
            wal start/stop: 000000010000042D00000031 / 000000010000042D00000032
            database size: 38.6GB, database backup size: 38.6GB
            repo1: backup set size: 11.9GB, backup size: 11.9GB

        full backup: 20210516-074502F
            timestamp start/stop: 2021-05-16 07:45:02 / 2021-05-16 08:06:04
            wal start/stop: 000000010000042D000000AA / 000000010000042D000000AA
            database size: 38.6GB, database backup size: 38.6GB
            repo1: backup set size: 12.0GB, backup size: 12.0GB

        full backup: 20210523-074502F
            timestamp start/stop: 2021-05-23 07:45:02 / 2021-05-23 08:06:28
            wal start/stop: 00000001000004300000000E / 00000001000004300000000E
            database size: 38.8GB, database backup size: 38.8GB
            repo1: backup set size: 12GB, backup size: 12GB

        full backup: 20210530-074503F
            timestamp start/stop: 2021-05-30 07:45:03 / 2021-05-30 08:07:04
            wal start/stop: 000000010000043200000070 / 000000010000043200000070
            database size: 38.9GB, database backup size: 38.9GB
            repo1: backup set size: 12GB, backup size: 12GB

        full backup: 20210606-074502F
            timestamp start/stop: 2021-06-06 07:45:02 / 2021-06-06 08:07:49
            wal start/stop: 0000000100000434000000D0 / 0000000100000434000000D0
            database size: 39.1GB, database backup size: 39.1GB
            repo1: backup set size: 12.2GB, backup size: 12.2GB

        diff backup: 20210606-074502F_20210607-074503D
            timestamp start/stop: 2021-06-07 07:45:03 / 2021-06-07 07:59:28
            wal start/stop: 00000001000004350000001A / 00000001000004350000001A
            database size: 39.1GB, database backup size: 26.5GB
            repo1: backup set size: 12.2GB, backup size: 7.4GB
            backup reference list: 20210606-074502F

        diff backup: 20210606-074502F_20210608-074502D
            timestamp start/stop: 2021-06-08 07:45:02 / 2021-06-08 07:59:39
            wal start/stop: 00000001000004350000007A / 00000001000004350000007A
            database size: 39.2GB, database backup size: 27.6GB
            repo1: backup set size: 12.2GB, backup size: 7.9GB
            backup reference list: 20210606-074502F
pgstef commented 3 years ago

Hi,

I guess the issue comes from the timeline switch: 000000010000042D00000031/000000020000042C00000038. The archives service needs to "guess" what the next WAL should be, and that's sometimes tricky in case of a timeline switch.

We have to parse the .history file to find the switch point. If the check doesn't find that boundary, there's a possibility of an infinite loop. I should probably add a "maximum number of WAL files to search" option to prevent that.

Could you have a look in the pgBackRest repo, in the archives part? There should be a 00000002.history file. With that, we could guess the boundary WAL file and check manually whether it exists in the repo...
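
Something along these lines should show it (an untested sketch, assuming the usual archive/<stanza>/<db-version-id> repo layout; adjust the path to whatever repo-ls reports for your stanza):

```bash
# List the stanza's archive directory to spot the .history files
pgbackrest repo-ls --stanza=goutou archive/goutou/11-1

# Dump the timeline-2 history file, if it is there
pgbackrest repo-get --stanza=goutou archive/goutou/11-1/00000002.history
```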

Kind regards

jaymefSO commented 3 years ago

Thanks for the speedy reply. Here are the contents of the history file:

1 42C/3848FF00 before 2021-05-11 15:59:00.326349+00

If there is a problem with the archives, is there a way I can correct it? Would expiring full backup: 20210514-111800F by taking another full backup do the trick?

pgstef commented 3 years ago

Basically said, PostgreSQL performs a timeline switch after each successful recovery or promotion. The timeline ID is then part of the WAL filenames, identifying which timeline they belong to.

Here, my guess would be that you tried to perform a test restore without preventing that restored "copy" from pushing archives to the repo. Your recovery target 2021-05-11 15:59:00 is even older than your latest production backup 2021-06-08 07:45:02, which still seems to be on the 1st timeline.

Given the history file, the boundary would be 000000020000042C00000038. That also seems to be the "max wal" in your repo, and I guess it is the only WAL on timeline 2 too.
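
For reference, here is a rough way to derive that boundary from the history entry (a sketch assuming the default 16MB WAL segment size, so the segment number is the LSN offset divided by 16MB):

```bash
# History entry: "1   42C/3848FF00   before 2021-05-11 15:59:00..."
# Timeline 2 starts at LSN 42C/3848FF00; with 16MB segments, the first WAL
# segment on the new timeline is <new timeline><xlogid><offset / 16MB>.
lsn="42C/3848FF00"
xlogid=${lsn%/*}
offset=${lsn#*/}
printf '%08X%08X%08X\n' 2 "$((16#$xlogid))" "$((16#$offset / 16#1000000))"
# -> 000000020000042C00000038
```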

I don't think expiring your oldest backup would solve this. In fact, the question is more about the existence of the 2nd timeline. If it was a test, and an error, you probably want to remove all the 00000002.... WAL archives from your repo. If you also remove the 00000002.history file, PostgreSQL will be able to pick that timeline again next time; otherwise it will go up to 3. Don't worry, timeline IDs are hexadecimal, so there's a huge number of IDs available. It's just a matter of "cleaning".
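
If you go that way, here is an untested sketch of the cleanup on the backup host (it assumes the default repo1-path of /var/lib/pgbackrest and the 11-1 directory reported as archives_dir by --debug; moving the files aside is safer than deleting them outright):

```bash
# Assumed default repo layout: <repo1-path>/archive/<stanza>/<db-version-id>
cd /var/lib/pgbackrest/archive/goutou/11-1

# Park every timeline-2 artifact (the 00000002... WAL directories and the
# 00000002.history file) outside the repo rather than deleting it outright.
mkdir -p ~/goutou-timeline2-parked
mv 00000002* ~/goutou-timeline2-parked/
```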

This is not really related to pgBackRest or monitoring, but more to how PostgreSQL behaves in case of a restore/recovery (which is always the tricky part of PITR).

Removing the unneeded archives from your repo should clean up your situation and solve the issue, even if I should probably also add some safeguard against that nasty infinite loop.

Sorry about that issue.

Kind Regards

jaymefSO commented 3 years ago

Thank you very much. I think you're exactly right: a restore was performed to a remote server (someone needed to view the contents of a database from a specific time). I believe that restore was done improperly; archive mode was not turned off as it should have been, leading to the restored copy pushing to the archive.

Just to be clear, you're saying that it should be safe to remove the 000000020000042C folder (this is the only folder in the archive sub-dir that starts with 00000002) along with the 00000002.history file?

Is that all that needs to be done, or do any other commands need to be run afterwards?

jaymefSO commented 3 years ago

I moved out the dir and history file and the check is working properly now:

# /usr/lib64/nagios/plugins/check_pgbackrest -s archives -S goutou --output=human --debug
DEBUG: pgBackRest info command was : 'pgbackrest info --stanza=goutou --output=json --log-level-console=error'
DEBUG: !> pgBackRest info took 0s
DEBUG: archives_dir: archive/goutou/11-1
DEBUG: Get all the WAL archives and history files...
DEBUG: pgBackRest version command was : 'pgbackrest version'
DEBUG: pgBackRest ls command was : 'pgbackrest repo-ls --stanza=goutou archive/goutou/11-1 --output=json --log-level-console=error --recurse'
DEBUG: !> Get all the WAL archives and history files took 0s
DEBUG: Get all the needed WAL archives...
DEBUG: !> Get all the needed WAL archives took 0s
DEBUG: !> Go through needed WAL list and check took 0s
DEBUG: Get all the needed WAL archives for 20210514-111800F...
DEBUG: Get all the needed WAL archives for 20210516-074502F...
DEBUG: Get all the needed WAL archives for 20210523-074502F...
DEBUG: Get all the needed WAL archives for 20210530-074503F...
DEBUG: Get all the needed WAL archives for 20210606-074502F...
DEBUG: Get all the needed WAL archives for 20210606-074502F_20210607-074503D...
DEBUG: Get all the needed WAL archives for 20210606-074502F_20210608-074502D...
DEBUG: !> Go through each backup, get the needed WAL and check took 0s
Service        : WAL_ARCHIVES
Returns        : 0 (OK)
Message        : 2160 WAL archived
Message        : latest archived since 7m8s
Long message   : latest_archive_age=7m8s
Long message   : num_archives=2160
Long message   : archives_dir=archive/goutou/11-1
Long message   : min_wal=000000010000042D00000031
Long message   : max_wal=0000000100000435000000A0
Long message   : latest_archive=0000000100000435000000A0
Long message   : latest_bck_archive_start=00000001000004350000007A
Long message   : latest_bck_type=diff
Long message   : oldest_archive=000000010000042D00000031
Long message   : oldest_bck_archive_start=000000010000042D00000031
Long message   : oldest_bck_type=full
pgstef commented 3 years ago

Exactly. If the 2nd timeline was generated by error, you can safely remove the 000000020000042C folder and the 00000002.history file. That will clean up your repo, and no other operation/command is needed then :-)

The best things to look at are your wal archive min, which matches your oldest backup's wal start, and your latest backup's wal stop, which is still on timeline 1. Since the timeline switch described in the .history file happened before your latest backup, I'd say it's safe to remove everything from timeline 2 in your repo.

Kind Regards

jaymefSO commented 3 years ago

Thank you very much again; I really appreciate the quick assistance with this. You saved me a lot of trouble! It would indeed be nice to see that infinite loop bug fixed in a future release.

sasoiliev commented 2 years ago

@pgstef I also ran into this issue; however, we are running a PAF cluster, so a timeline switch is expected (e.g., after a failover or a manual switchover of the master node).

I saw that there's now the --max-archives-check-number option, but in our case it doesn't seem to help much. It helps us avoid the out-of-memory error, but still doesn't allow us to check whether our WALs are available on the backup server.

(For now we've disabled the archives check and we're only doing the -s retention one.)

Is this something that is conceptually hard to do?

Thanks!

pgstef commented 2 years ago

Hi,

> @pgstef I also ran into this issue; however, we are running a PAF cluster, so a timeline switch is expected (e.g., after a failover or a manual switchover of the master node).

First of all, please don't comment on closed issues. It would be nice to open a new one with your setup and how you encountered the issue. Timeline switches can happen often (recovery, standby promotion, ...), whether or not you have an HA management tool. That's expected indeed. History files pushed to the backup repo are expected too.

> Is this something that is conceptually hard to do?

Imagine you have your min wal = 000000010000000000000001 (before the TL switch) and your max wal = 000000020000000000000010 (after the TL switch). Without the history file to detect the boundary WAL (where the TL switch happened), it is impossible to predict when to jump from 00000001 to 00000002. And since the loop is basically while "next wal to check" <= "max wal", I don't see how to avoid the infinite loop. That being said, a "safeguard" could probably be to check whether the boundary WAL can be determined for each TL switch. I've added something like this, hoping to help detect the missing pieces sooner.
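
To illustrate the shape of the problem, here is a rough sketch in shell (not the plugin's actual Perl code; it assumes the default 16MB segment size, where the segment part of a WAL name wraps after FF, and a hard iteration cap in the spirit of --max-archives-check-number):

```bash
# Next expected WAL name on the SAME timeline: TTTTTTTTXXXXXXXXYYYYYYYY,
# where YYYYYYYY runs 00..FF before XXXXXXXX is bumped (16MB segments, PostgreSQL >= 9.3).
next_wal() {
  local tl=${1:0:8} log=${1:8:8} seg=${1:16:8}
  if [ "$((16#$seg))" -eq 255 ]; then
    printf '%s%08X%08X\n' "$tl" "$((16#$log + 1))" 0
  else
    printf '%s%s%08X\n' "$tl" "$log" "$((16#$seg + 1))"
  fi
}

wal=000000010000000000000001   # min wal (before the TL switch)
max=000000020000000000000010   # max wal (after the TL switch)
limit=1000                     # hard cap, the idea behind --max-archives-check-number
count=0
# Without the boundary WAL from 00000002.history, every generated name stays on
# timeline 1 and compares below the timeline-2 "max wal", so the walk never ends.
while [[ "$wal" < "$max" && $count -lt $limit ]]; do
  wal=$(next_wal "$wal")
  count=$((count + 1))
done
echo "stopped at $wal after $count steps"
```

With the boundary known from the history file, the walk can jump from the last timeline-1 name straight to that timeline-2 segment instead of incrementing forever.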

So conceptually, there's probably something in your repo content leading to that issue (a missing history file?). That's why it would have been nice to have more info, by opening a new issue giving more context.

Kind Regards

sasoiliev commented 2 years ago

Apologies for commenting on a closed issue and thanks for the details provided.

I'll investigate further and log a separate issue if needed.

Best regards!