pgstef / check_pgbackrest

pgBackRest backup check plugin for Nagios
PostgreSQL License

WAL_ARCHIVES UNKNOWN - no archived WAL found #5

Closed renesepp closed 5 years ago

renesepp commented 5 years ago

Hello,

check_pgbackrest returns "WAL_ARCHIVES UNKNOWN - no archived WAL found" when in fact WAL files are present.

pgBackRest 2.15
repo-type=cifs
perl check_pgbackrest --service=archives --stanza=arendusdok --repo-path=/var/lib/pgbackrest/mountpoint2/archive --debug
DEBUG: pgBackRest info command was : 'pgbackrest info --stanza=arendusdok --output=json'
DEBUG: archives_dir: /var/lib/pgbackrest/mountpoint2/archive/arendusdok/9.4-1
WAL_ARCHIVES UNKNOWN - no archived WAL found

ls /var/lib/pgbackrest/mountpoint2/archive/arendusdok/9.4-1
00000001000000BC  00000001000000BD  00000001000000BE  00000001000000BF  00000001000000C0  00000001000000C1

ls -la /var/lib/pgbackrest/mountpoint2/archive/arendusdok/9.4-1/00000001000000BC/
-rwxr----- 1 backrest backrest     340 Jul  7 21:18 00000001000000BC00000034.00000028.backup
-rwxr----- 1 backrest backrest 4968826 Jul  7 20:31 00000001000000BC00000034-bb937d28591d63decaf81758a4b667607a866970.gz
-rwxr----- 1 backrest backrest 5077612 Jul  7 21:05 00000001000000BC00000035-1d4009dad36300b75270f79bea7abb5ce8fb1a24.gz
-rwxr----- 1 backrest backrest 1996573 Jul  7 21:18 00000001000000BC00000036-16efe841c07d95e40e6abe9edf67fec766af2184.gz
-rwxr----- 1 backrest backrest 4979708 Jul  7 21:50 00000001000000BC00000037-84aacb60393bdb98036e54d51e4b823d9baf839e.gz
...
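
To confirm independently of the plugin that archived segments are present, the files can be counted from the shell. This is only a minimal sketch: the helper name `count_wal_archives` is hypothetical, the file name pattern is inferred from the listing above (24 hex digits, a dash, a 40-character checksum, an optional extension), and it does not reproduce the directory walk that check_pgbackrest performs itself.

```shell
#!/bin/sh
# Hypothetical helper: count files under an archive directory whose names
# look like the archived WAL segments in the listing above.
# The *.backup history file does not match and is not counted.
count_wal_archives() {
    archives_dir=$1
    # grep -c prints 0 and exits non-zero when nothing matches,
    # so "|| true" keeps the function's exit status at 0.
    find "$archives_dir" -type f 2>/dev/null \
        | grep -E -c '/[0-9A-F]{24}-[0-9a-f]{40}(\.[a-z0-9]+)?$' \
        || true
}

# Usage, with the directory from the debug output above:
#   count_wal_archives /var/lib/pgbackrest/mountpoint2/archive/arendusdok/9.4-1
```

A non-zero count here, combined with the UNKNOWN result from the plugin, would suggest that the files are readable and that the problem lies in how the plugin scans the directory.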
pgstef commented 5 years ago

Hi,

Could you provide :

I'll try to reproduce your setup.

If check_pgbackrest is launched as the backrest user, I don't see any obvious problem.

Thanks, Kind regards,

renesepp commented 5 years ago

Hi,

CentOS Linux release 7.6.1810 (Core)

[global]
repo-type=cifs
repo-path=/var/lib/pgbackrest/mountpoint2
process-max=4
compress=n
compress-level-network=3
retention-full=4
retention-diff=15
start-fast=y
stop-auto=y

[arendusdok]
db-host=example.domain
db-path=/var/lib/pgsql/9.4/data
db-user=postgres
process-max=2

[global]
backup-host=pgbackrest.example.domain
backup-user=backrest
log-level-file=detail
archive-async=n

[global:archive-get]
process-max=2

[global:archive-push]
process-max=2

[arendusdok]
db-path=/var/lib/pgsql/9.4/data

stanza: arendusdok
    status: ok
    cipher: none

db (current)
    wal archive min/max (9.4-1): 00000001000000BD000000DF/00000001000000C4000000E3

    full backup: 20190714-200002F
        timestamp start/stop: 2019-07-14 20:00:02 / 2019-07-14 20:37:45
        wal start/stop: 00000001000000BD000000DF / 00000001000000BD000000E0
        database size: 13.7GB, backup size: 13.7GB
        repository size: 13.7GB, repository backup size: 13.7GB

    full backup: 20190721-200001F
        timestamp start/stop: 2019-07-21 20:00:01 / 2019-07-21 20:36:43
        wal start/stop: 00000001000000BF000000B0 / 00000001000000BF000000B0
        database size: 13.8GB, backup size: 13.8GB
        repository size: 13.8GB, repository backup size: 13.8GB

    diff backup: 20190721-200001F_20190722-200001D
        timestamp start/stop: 2019-07-22 20:00:01 / 2019-07-22 20:06:21
        wal start/stop: 00000001000000BF000000D8 / 00000001000000BF000000D8
        database size: 13.8GB, backup size: 8.2GB
        repository size: 13.8GB, repository backup size: 8.2GB
        backup reference list: 20190721-200001F

    diff backup: 20190721-200001F_20190723-200001D
        timestamp start/stop: 2019-07-23 20:00:01 / 2019-07-23 20:07:39
        wal start/stop: 00000001000000C000000003 / 00000001000000C000000003
        database size: 13.9GB, backup size: 9.2GB
        repository size: 13.9GB, repository backup size: 9.2GB
        backup reference list: 20190721-200001F

    diff backup: 20190721-200001F_20190724-200002D
        timestamp start/stop: 2019-07-24 20:00:02 / 2019-07-24 20:07:47
        wal start/stop: 00000001000000C00000002F / 00000001000000C00000002F
        database size: 13.9GB, backup size: 10.3GB
        repository size: 13.9GB, repository backup size: 10.3GB
        backup reference list: 20190721-200001F

    diff backup: 20190721-200001F_20190725-200001D
        timestamp start/stop: 2019-07-25 20:00:01 / 2019-07-25 20:09:38
        wal start/stop: 00000001000000C00000009F / 00000001000000C00000009F
        database size: 14GB, backup size: 12.4GB
        repository size: 14GB, repository backup size: 12.4GB
        backup reference list: 20190721-200001F

    diff backup: 20190721-200001F_20190726-200001D
        timestamp start/stop: 2019-07-26 20:00:01 / 2019-07-26 20:09:35
        wal start/stop: 00000001000000C0000000E1 / 00000001000000C0000000E1
        database size: 14.1GB, backup size: 12.4GB
        repository size: 14.1GB, repository backup size: 12.4GB
        backup reference list: 20190721-200001F

    diff backup: 20190721-200001F_20190727-200001D
        timestamp start/stop: 2019-07-27 20:00:01 / 2019-07-27 20:11:48
        wal start/stop: 00000001000000C100000013 / 00000001000000C100000013
        database size: 14.1GB, backup size: 12.4GB
        repository size: 14.1GB, repository backup size: 12.4GB
        backup reference list: 20190721-200001F

    full backup: 20190728-200002F
        timestamp start/stop: 2019-07-28 20:00:02 / 2019-07-28 20:37:00
        wal start/stop: 00000001000000C100000045 / 00000001000000C100000046
        database size: 14.1GB, backup size: 14.1GB
        repository size: 14.1GB, repository backup size: 14.1GB

    diff backup: 20190728-200002F_20190729-200001D
        timestamp start/stop: 2019-07-29 20:00:01 / 2019-07-29 20:06:57
        wal start/stop: 00000001000000C1000000C2 / 00000001000000C1000000C2
        database size: 14.3GB, backup size: 9.5GB
        repository size: 14.3GB, repository backup size: 9.5GB
        backup reference list: 20190728-200002F

    diff backup: 20190728-200002F_20190730-200001D
        timestamp start/stop: 2019-07-30 20:00:01 / 2019-07-30 20:08:48
        wal start/stop: 00000001000000C200000003 / 00000001000000C200000003
        database size: 14.3GB, backup size: 10.3GB
        repository size: 14.3GB, repository backup size: 10.3GB
        backup reference list: 20190728-200002F

    diff backup: 20190728-200002F_20190731-200001D
        timestamp start/stop: 2019-07-31 20:00:01 / 2019-07-31 20:09:27
        wal start/stop: 00000001000000C2000000AF / 00000001000000C2000000AF
        database size: 14.6GB, backup size: 12GB
        repository size: 14.6GB, repository backup size: 12GB
        backup reference list: 20190728-200002F

    diff backup: 20190728-200002F_20190801-200001D
        timestamp start/stop: 2019-08-01 20:00:01 / 2019-08-01 20:11:50
        wal start/stop: 00000001000000C30000006A / 00000001000000C30000006A
        database size: 14.8GB, backup size: 12.3GB
        repository size: 14.8GB, repository backup size: 12.3GB
        backup reference list: 20190728-200002F

    diff backup: 20190728-200002F_20190802-200001D
        timestamp start/stop: 2019-08-02 20:00:01 / 2019-08-02 20:09:48
        wal start/stop: 00000001000000C3000000B4 / 00000001000000C3000000B4
        database size: 14.8GB, backup size: 12.3GB
        repository size: 14.8GB, repository backup size: 12.3GB
        backup reference list: 20190728-200002F

    diff backup: 20190728-200002F_20190803-200002D
        timestamp start/stop: 2019-08-03 20:00:02 / 2019-08-03 20:14:15
        wal start/stop: 00000001000000C3000000E6 / 00000001000000C3000000E6
        database size: 14.8GB, backup size: 12.3GB
        repository size: 14.8GB, repository backup size: 12.3GB
        backup reference list: 20190728-200002F

    full backup: 20190804-200001F
        timestamp start/stop: 2019-08-04 20:00:01 / 2019-08-04 21:15:35
        wal start/stop: 00000001000000C400000017 / 00000001000000C400000019
        database size: 14.8GB, backup size: 14.8GB
        repository size: 14.8GB, repository backup size: 14.8GB

    diff backup: 20190804-200001F_20190805-200002D
        timestamp start/stop: 2019-08-05 20:00:02 / 2019-08-05 20:08:32
        wal start/stop: 00000001000000C4000000B6 / 00000001000000C4000000B6
        database size: 15GB, backup size: 10.9GB
        repository size: 15GB, repository backup size: 10.9GB
        backup reference list: 20190804-200001F
pgstef commented 5 years ago

Hm. I don't really see any problem here. I tried to replicate your setup, which gave me:

-bash-4.2$ pgbackrest info
stanza: arendusdok
    status: ok
    cipher: none

    db (current)
        wal archive min/max (9.4-1): 000000010000000000000003/00000001000000020000001A

        full backup: 20190808-101420F
            timestamp start/stop: 2019-08-08 10:14:20 / 2019-08-08 10:14:33
            wal start/stop: 000000010000000000000003 / 000000010000000000000003
            database size: 20.1MB, backup size: 20.1MB
            repository size: 20.1MB, repository backup size: 20.1MB

        diff backup: 20190808-101420F_20190808-102200D
            timestamp start/stop: 2019-08-08 10:22:00 / 2019-08-08 10:22:34
            wal start/stop: 0000000100000000000000A1 / 0000000100000000000000A1
            database size: 1.5GB, backup size: 1.5GB
            repository size: 1.5GB, repository backup size: 1.5GB
            backup reference list: 20190808-101420F

        full backup: 20190808-103358F
            timestamp start/stop: 2019-08-08 10:33:58 / 2019-08-08 10:34:52
            wal start/stop: 000000010000000100000058 / 00000001000000010000005A
            database size: 1.5GB, backup size: 1.5GB
            repository size: 1.5GB, repository backup size: 1.5GB

        diff backup: 20190808-103358F_20190808-103553D
            timestamp start/stop: 2019-08-08 10:35:53 / 2019-08-08 10:36:18
            wal start/stop: 00000001000000010000006A / 000000010000000100000071
            database size: 309.8MB, backup size: 283.8MB
            repository size: 309.8MB, repository backup size: 283.8MB
            backup reference list: 20190808-103358F

-bash-4.2$ perl check_pgbackrest --service=archives --stanza=arendusdok --repo-path=/var/lib/pgbackrest/mountpoint2/archive
WAL_ARCHIVES OK - 536 WAL archived, latest archived since 4m18s | latest_archive_age=4m18s num_archives=536

This was with pgBackRest 2.15, PostgreSQL 9.4 and check_pgbackrest master.

renesepp commented 5 years ago

Hm. Did you test with repo-type=cifs or only with posix? I tested with posix and the archive check seems to work there.

CIFS mount options

vers=3.0,sec=ntlmssp,uid=backrest,gid=backrest,dir_mode=0750,file_mode=0740

pgstef commented 5 years ago

I tried with repo-type=cifs and the exact same configuration file as yours, but indeed not on a real CIFS mount. I will try that.

Meanwhile, that's a pgBackRest configuration option. It shouldn't impact check_pgbackrest.

Could you provide the output of the pgbackrest info --stanza=arendusdok --output=json command with repo-type set to cifs and then to posix, to check whether there is any difference?

I don't really see why it would have an impact on check_pgbackrest itself, but it's worth trying.
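
One possible way to do that comparison is to save each output to a file and diff the two. This is only a sketch: the helper name `compare_info_json` and the file names `info-cifs.json` and `info-posix.json` are hypothetical, and only the pgbackrest command itself comes from this thread.

```shell
#!/bin/sh
# Hypothetical helper: compare two saved outputs of
#   pgbackrest info --stanza=arendusdok --output=json
# Prints a unified diff when the files differ.
compare_info_json() {
    if diff -u "$1" "$2"; then
        echo "no difference"
        return 0
    else
        status=$?
        # diff exits with 1 for "files differ" and 2 for an error
        [ "$status" -eq 1 ] && echo "outputs differ"
        return "$status"
    fi
}

# Usage (run the info command once per repo-type setting):
#   pgbackrest info --stanza=arendusdok --output=json > info-cifs.json
#   pgbackrest info --stanza=arendusdok --output=json > info-posix.json
#   compare_info_json info-cifs.json info-posix.json
```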

Kind regards

pgstef commented 5 years ago

Ok. I was on a "local" repository even though pgBackRest had the repo-type=cifs option set. I have now put the repo on a real CIFS mount point and was able to reproduce the "UNKNOWN" error.

To solve the problem, I had to add follow => 1 there: https://github.com/dalibo/check_pgbackrest/blob/d8d45effa1986ba88f8e1aefbe64aa676fb224f6/check_pgbackrest#L662

(You can add it locally as a temporary workaround.)

I'll try to run some regression tests with that modification on other test cases to see if I need to add a specific option to activate it or not.

That will be added in the next release.

Thanks for reporting and helping debug the behaviour.

Kind regards

Krysztophe commented 5 years ago

Problem reproduced and fix confirmed. At first glance it does not seem to have any negative impact.

renesepp commented 5 years ago

Fix confirmed, works fine now with a real CIFS mount.

Thanks.

pgstef commented 5 years ago

Hi,

The commit https://github.com/dalibo/check_pgbackrest/commit/af85ec8ffd64687849f761e0bf6d3d1b0a14a5b3 adds that modification, along with some regression tests using a CIFS mount.

Thanks again for reporting.

Kind regards