pgstef / check_pgbackrest

pgBackRest backup check plugin for Nagios
PostgreSQL License

Bug with --repo-s3 #8

Closed khadijahvf closed 4 years ago

khadijahvf commented 4 years ago

Hi,

I'm new to git, apologies if I'm not following the correct protocol.

My pgBackRest repo is hosted in an Amazon S3 bucket, and the archives service check is returning UNKNOWN.

$ check_pgbackrest --version
check_pgbackrest version 1.9dev, Perl 5.16.3

$ check_pgbackrest --service archives --stanza tsdb_cluster --output human --ignore-archived-after 5m --ignore-archived-before 24h --repo-path pgbackrest/archive --repo-s3 --config /etc/pgbackrest/pgbackrest.conf --debug
DEBUG: pgBackRest info command was : 'pgbackrest info --stanza=tsdb_cluster --output=json --config=/etc/pgbackrest/pgbackrest.conf'
DEBUG: !> pgBackRest info took 0s
DEBUG: archives_dir: pgbackrest/archive/tsdb_cluster/11-1
DEBUG: Get all the WAL archives and history files...
DEBUG: cfg_file: /etc/pgbackrest/pgbackrest.conf
DEBUG: repo1-s3-bucket: tsdb_cluster-dummybucket
DEBUG: repo1-s3-endpoint: s3.ap-southeast-2.amazonaws.com
Service        : WAL_ARCHIVES
Returns        : 3 (UNKNOWN)
Message        : no archived WAL found

I think in my case, the issues are due to the 2 lines below:

Line 807:

my $stream = $bucket->list({ prefix => $archives_dir, delimiter => '/' });

To fix my issue, I removed the delimiter option, as the gzipped files are in their own subdirectories, e.g.:

$ aws s3 ls s3://tsdb_cluster-dummybucket/pgbackrest/archive/tsdb_cluster/11-1/0000001B00002296/
2020-04-22 10:15:31    4717759 0000001B0000229600000000-1bf14812fa36aa0388e2df8f5958af0ecc3fef5d.gz
2020-04-22 10:15:45    4642067 0000001B0000229600000001-fbcf051e1d89e80949658acc4082450f4cc86269.gz
2020-04-22 10:16:04    4626255 0000001B0000229600000002-94d8c82583b32a0b192f0d0bf034fc39fb8b1de1.gz
2020-04-22 10:16:10    4792221 0000001B0000229600000003-f2c83ab8b5258b0eb71ebb319c600f75d683b4a1.gz
2020-04-22 10:16:17    4882457 0000001B0000229600000004-0869a9ab43150898b65ecbc6821b42012e28d019.gz
2020-04-22 10:16:32    4823052 0000001B0000229600000005-1853c727f38d956a55fdf92934170de1ea23e517.gz
2020-04-22 10:17:06    4822538 0000001B0000229600000006-55b5bf21c3c14be88c125615e39e1520a0b86773.gz
2020-04-22 10:17:21    4986754 0000001B0000229600000007-ddf82aee6e2bb13de5502277895b311212ec9b3f.gz
2020-04-22 10:17:39    4930414 0000001B0000229600000008-1778c1f0d22d3c42316bd41b7712efb72afba86e.gz
2020-04-22 10:18:21    4883000 0000001B0000229600000009-d719720c58a320e424f1746df9e15cc13d25afe3.gz
2020-04-22 10:18:40    5002275 0000001B000022960000000A-e29be205513c16ff6f656ff315c4486c5ed4ce83.gz
2020-04-22 10:18:52    4878073 0000001B000022960000000B-c82963cf48c1a3cc94e571c3676117131ca3e120.gz
2020-04-22 10:19:19    4878142 0000001B000022960000000C-deb82ea26cad4aa24a8239963b9187ddbbf2dfa9.gz
2020-04-22 10:19:50    4973264 0000001B000022960000000D-9c85a131b9e6ac6257ae294086604e3e83e41cbb.gz

Line 981:

my $archives_dir = $args{'repo-path'}."/".$args{'stanza'}."/".$backups_info->{'archive'}[0]->{'id'};

To fix my issue, I added an extra / at the end of the string, e.g.:

DEBUG: archives_dir: pgbackrest/archive/tsdb_cluster/11-1/
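The effect of the delimiter option and of the trailing slash can be illustrated with a small simulation. This is only a sketch of the documented S3 prefix/delimiter grouping rule, not the plugin's Perl code: `list_objects` is a hypothetical stand-in for the bucket listing call, and the two keys are taken from the listing above.

```python
def list_objects(keys, prefix="", delimiter=None):
    """Mimic S3 list grouping: return (object_keys, common_prefixes)."""
    contents, common = [], []
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        if delimiter:
            rest = key[len(prefix):]
            cut = rest.find(delimiter)
            if cut != -1:
                # Keys that contain the delimiter after the prefix are not
                # returned individually; they are rolled up into one entry.
                rolled = prefix + rest[:cut + len(delimiter)]
                if rolled not in common:
                    common.append(rolled)
                continue
        contents.append(key)
    return contents, common


base = "pgbackrest/archive/tsdb_cluster/11-1"
subdir = base + "/0000001B00002296/"
keys = [
    subdir + "0000001B0000229600000000-1bf14812fa36aa0388e2df8f5958af0ecc3fef5d.gz",
    subdir + "0000001B0000229600000001-fbcf051e1d89e80949658acc4082450f4cc86269.gz",
]

# Prefix without trailing slash, with delimiter: only the directory itself.
with_delim = list_objects(keys, prefix=base, delimiter="/")
# Prefix with trailing slash, with delimiter: only the WAL subdirectory.
with_slash = list_objects(keys, prefix=base + "/", delimiter="/")
# No delimiter: every archived file under the prefix.
flat = list_objects(keys, prefix=base)

print(with_delim)
print(with_slash)
print(len(flat[0]), "objects in the flat listing")
```

Under this rule, a listing with the delimiter never returns the .gz files themselves, because they sit one directory level below the archive id; only the flat listing does.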

Did I pass the wrong values to check_pgbackrest somehow? I'd like to avoid the temporary workaround I put in.

Thanks for any help.

Cheers, Khadijah

pgstef commented 4 years ago

Hi,

Unfortunately, I can't reproduce your issue.

Here's my configuration:

$ aws s3 ls s3://pgbackrest/repo1/archive/my_stanza/12-1/0000000100000000/
2020-04-22 08:32:45      16485 000000010000000000000003-a8e2697432b68f6385a8f6f41b6104fdd13ce33b.gz
2020-04-22 08:32:45        371 000000010000000000000003.00000028.backup
2020-04-22 08:32:47      16463 000000010000000000000004-a3aef31760fb481fc4a06604f10ebb618d8c7f96.gz
2020-04-22 08:32:47        370 000000010000000000000004.00000028.backup
2020-04-22 08:32:50      16460 000000010000000000000005-494754955ec338f28db753520f648f99a73daa35.gz
2020-04-22 08:32:50        372 000000010000000000000005.00000028.backup

$ check_pgbackrest --stanza=my_stanza --service=archives --repo-path=repo1/archive --repo-s3 --debug --output=human
DEBUG: pgBackRest info command was : 'pgbackrest info --stanza=my_stanza --output=json'
DEBUG: !> pgBackRest info took 0s
DEBUG: archives_dir: repo1/archive/my_stanza/12-1
DEBUG: Get all the WAL archives and history files...
DEBUG: pgBackRest version command was : 'pgbackrest version'
DEBUG: cfg_file: /etc/pgbackrest.conf
DEBUG: repo1-s3-bucket: pgbackrest
DEBUG: repo1-s3-endpoint: minio.local
DEBUG: !> Get all the WAL archives and history files took 1s
DEBUG: Get all the needed wal archives...
DEBUG: !> Get all the needed wal archives took 0s
DEBUG: !> Go through needed wal list and check took 0s
DEBUG: Get all the needed wal archives for 20200422-083239F...
DEBUG: Get all the needed wal archives for 20200422-083239F_20200422-083246D...
DEBUG: Get all the needed wal archives for 20200422-083239F_20200422-083248I...
DEBUG: !> Go through each backup, get the needed wal and check took 0s
Service        : WAL_ARCHIVES
Returns        : 0 (OK)
Message        : 3 WAL archived
Message        : latest archived since 1h8m15s
Long message   : latest_archive_age=1h8m15s
Long message   : num_archives=3
Long message   : archives_dir=repo1/archive/my_stanza/12-1
Long message   : min_wal=000000010000000000000003
Long message   : max_wal=000000010000000000000005
Long message   : latest_archive=000000010000000000000005
Long message   : latest_bck_archive_start=000000010000000000000005
Long message   : latest_bck_type=incr
Long message   : oldest_archive=000000010000000000000003
Long message   : oldest_bck_archive_start=000000010000000000000003
Long message   : oldest_bck_type=full

I'd rather not change the archives_dir assignment on line 981.

Could you try with --repo-path=/pgbackrest/archive?

The delimiter on line 807 shouldn't change anything. Can you try the above with and without the delimiter, please?

Kind regards

khadijahvf commented 4 years ago

Thanks for the reply, much appreciated.

With the delimiter and --repo-path /pgbackrest/archive:

-bash-4.2$ check_pgbackrest --service archives --stanza tsdb_cluster --output human --ignore-archived-after 5m --ignore-archived-before 24h --repo-path /pgbackrest/archive --repo-s3 --config /etc/pgbackrest/pgbackrest.conf --repo-s3-over-http --debug
DEBUG: pgBackRest info command was : 'pgbackrest info --stanza=tsdb_cluster --output=json --config=/etc/pgbackrest/pgbackrest.conf'
DEBUG: !> pgBackRest info took 0s
DEBUG: archives_dir: /pgbackrest/archive/tsdb_cluster/11-1
DEBUG: Get all the WAL archives and history files...
DEBUG: pgBackRest version command was : 'pgbackrest version --config=/etc/pgbackrest/pgbackrest.conf'
DEBUG: cfg_file: /etc/pgbackrest/pgbackrest.conf
DEBUG: repo1-s3-bucket: mydummybucket
DEBUG: repo1-s3-endpoint: s3.ap-southeast-2.amazonaws.com
Service        : WAL_ARCHIVES
Returns        : 3 (UNKNOWN)
Message        : no archived WAL found

Without the delimiter and --repo-path /pgbackrest/archive:

$ check_pgbackrest --service archives --stanza tsdb_cluster --output human --ignore-archived-after 5m --ignore-archived-before 24h --repo-path /pgbackrest/archive --repo-s3 --config /etc/pgbackrest/pgbackrest.conf --repo-s3-over-http --debug
DEBUG: pgBackRest info command was : 'pgbackrest info --stanza=tsdb_cluster --output=json --config=/etc/pgbackrest/pgbackrest.conf'
DEBUG: !> pgBackRest info took 0s
DEBUG: archives_dir: /pgbackrest/archive/tsdb_cluster/11-1
DEBUG: Get all the WAL archives and history files...
DEBUG: pgBackRest version command was : 'pgbackrest version --config=/etc/pgbackrest/pgbackrest.conf'
DEBUG: cfg_file: /etc/pgbackrest/pgbackrest.conf
DEBUG: repo1-s3-bucket: mydummybucket
DEBUG: repo1-s3-endpoint: s3.ap-southeast-2.amazonaws.com
Service        : WAL_ARCHIVES
Returns        : 3 (UNKNOWN)
Message        : no archived WAL found

Without the delimiter and --repo-path=pgbackrest/archive:

-bash-4.2$ check_pgbackrest --service archives --stanza tsdb_cluster --output human --ignore-archived-after 5m --ignore-archived-before 24h --repo-path pgbackrest/archive --repo-s3 --config /etc/pgbackrest/pgbackrest.conf --repo-s3-over-http --debug
DEBUG: pgBackRest info command was : 'pgbackrest info --stanza=tsdb_cluster --output=json --config=/etc/pgbackrest/pgbackrest.conf'
DEBUG: !> pgBackRest info took 0s
DEBUG: archives_dir: pgbackrest/archive/tsdb_cluster/11-1
DEBUG: Get all the WAL archives and history files...
DEBUG: pgBackRest version command was : 'pgbackrest version --config=/etc/pgbackrest/pgbackrest.conf'
DEBUG: cfg_file: /etc/pgbackrest/pgbackrest.conf
DEBUG: repo1-s3-bucket: mydummybucket
DEBUG: repo1-s3-endpoint: s3.ap-southeast-2.amazonaws.com
DEBUG: !> Get all the WAL archives and history files took 1s
DEBUG: min_wal changed to 000000030000000000000007
DEBUG: max_wal changed to 000000030000000000000007
DEBUG: Get all the needed wal archives...
DEBUG: !> Get all the needed wal archives took 0s
DEBUG: !> Go through needed wal list and check took 0s
DEBUG: Get all the needed wal archives for 20200423-083723F...
DEBUG: !> Go through each backup, get the needed wal and check took 0s
Service        : WAL_ARCHIVES
Returns        : 0 (OK)
Message        : 1 WAL archived
Message        : latest archived since 33m50s
Long message   : latest_archive_age=33m50s
Long message   : num_archives=1
Long message   : archives_dir=pgbackrest/archive/tsdb_cluster/11-1
Long message   : min_wal=000000030000000000000007
Long message   : max_wal=000000030000000000000007
Long message   : latest_archive=000000030000000000000007
Long message   : latest_bck_archive_start=000000030000000000000007
Long message   : latest_bck_type=full
Long message   : oldest_archive=000000030000000000000007
Long message   : oldest_bck_archive_start=000000030000000000000007
Long message   : oldest_bck_type=full
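The last two runs differ only in the leading slash of --repo-path, which suggests the prefix is compared literally against the object keys. A minimal sketch of that effect, assuming keys shaped like the ones shown by aws s3 ls earlier in the thread (no leading slash); `archive_prefix` is a hypothetical helper, and the checksum in the sample key is a placeholder:

```python
def archive_prefix(repo_path, stanza, archive_id):
    # Hypothetical helper: the same join as line 981, but with slashes
    # around repo_path stripped so the result lines up with S3 keys.
    return "/".join([repo_path.strip("/"), stanza, archive_id])


# Illustrative key; the 40-character checksum is a placeholder.
key = (
    "pgbackrest/archive/tsdb_cluster/11-1/0000000300000000/"
    "000000030000000000000007-" + "0" * 40 + ".gz"
)

# Prefix as built from --repo-path /pgbackrest/archive (second run).
raw = "/pgbackrest/archive" + "/" + "tsdb_cluster" + "/" + "11-1"
# Prefix after stripping the leading slash (equivalent to the third run).
fixed = archive_prefix("/pgbackrest/archive", "tsdb_cluster", "11-1")

print(key.startswith(raw))    # False: no key begins with "/"
print(fixed)                  # pgbackrest/archive/tsdb_cluster/11-1
print(key.startswith(fixed))  # True
```

Whether the plugin should normalize the path itself or simply document the expected form is a design choice this sketch does not settle.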

Cheers, Khadijah

pgstef commented 4 years ago

Hi,

Thanks for your tests.

Based on your input, I think removing the delimiter in the source code should be enough on its own.

Could you confirm with:

$ check_pgbackrest --service=archives --stanza=tsdb_cluster --output=human --debug \
--repo-path=pgbackrest/archive --repo-s3 --config=/etc/pgbackrest/pgbackrest.conf

And confirm the repo content with:

$ pgbackrest info --stanza=tsdb_cluster --config=/etc/pgbackrest/pgbackrest.conf

If you confirm, I'll adjust the source code (remove the delimiter) and the regression tests accordingly.
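Once the delimiter is gone, the listing returns every key under the archive prefix, so the WAL segment names have to be picked out of the full key paths. A rough sketch of one way to do that, assuming the file naming visible in the listings in this thread (24 hex characters, a dash, a 40-character checksum, then a compression extension); this is not the plugin's actual code, and `wal_segments` is a hypothetical name:

```python
import re

# Segment name, dash, 40-char checksum, extension, anchored at a path boundary.
WAL_KEY = re.compile(r"(?:^|/)([0-9A-F]{24})-[0-9a-f]{40}\.\w+$")


def wal_segments(keys):
    """Return the sorted WAL segment names found in a flat key listing."""
    found = []
    for key in keys:
        match = WAL_KEY.search(key)
        if match:
            found.append(match.group(1))
    # Fixed-width uppercase hex sorts lexicographically in numeric order.
    return sorted(found)


subdir = "pgbackrest/archive/tsdb_cluster/11-1/0000001B00002296/"
keys = [
    subdir + "0000001B000022960000000A-e29be205513c16ff6f656ff315c4486c5ed4ce83.gz",
    subdir + "0000001B0000229600000000-1bf14812fa36aa0388e2df8f5958af0ecc3fef5d.gz",
    # Illustrative .backup file name, skipped because it is not a WAL segment.
    subdir + "0000001B0000229600000000.00000028.backup",
]

segments = wal_segments(keys)
print("min_wal:", segments[0])
print("max_wal:", segments[-1])
```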

Kind regards

khadijahvf commented 4 years ago

Great! Thank you.

Below is the output:

$ check_pgbackrest --service=archives --stanza=tsdb_cluster --output=human --repo-path=pgbackrest/archive --repo-s3 --config=/etc/pgbackrest/pgbackrest.conf --debug
DEBUG: pgBackRest info command was : 'pgbackrest info --stanza=tsdb_cluster --output=json --config=/etc/pgbackrest/pgbackrest.conf'
DEBUG: !> pgBackRest info took 0s
DEBUG: archives_dir: pgbackrest/archive/tsdb_cluster/11-1
DEBUG: Get all the WAL archives and history files...
DEBUG: pgBackRest version command was : 'pgbackrest version --config=/etc/pgbackrest/pgbackrest.conf'
DEBUG: cfg_file: /etc/pgbackrest/pgbackrest.conf
DEBUG: repo1-s3-bucket: mydummybucket
DEBUG: repo1-s3-endpoint: s3.ap-southeast-2.amazonaws.com
DEBUG: !> Get all the WAL archives and history files took 2s
DEBUG: Get all the needed wal archives...
DEBUG: !> Get all the needed wal archives took 0s
DEBUG: !> Go through needed wal list and check took 0s
DEBUG: Get all the needed wal archives for 20200423-083723F...
DEBUG: !> Go through each backup, get the needed wal and check took 0s
Service        : WAL_ARCHIVES
Returns        : 0 (OK)
Message        : 1 WAL archived
Message        : latest archived since 8h22m57s
Long message   : latest_archive_age=8h22m57s
Long message   : num_archives=1
Long message   : archives_dir=pgbackrest/archive/tsdb_cluster/11-1
Long message   : min_wal=000000030000000000000007
Long message   : max_wal=000000030000000000000007
Long message   : latest_archive=000000030000000000000007
Long message   : latest_bck_archive_start=000000030000000000000007
Long message   : latest_bck_type=full
Long message   : oldest_archive=000000030000000000000007
Long message   : oldest_bck_archive_start=000000030000000000000007
Long message   : oldest_bck_type=full

$ pgbackrest info --stanza=tsdb_cluster --config=/etc/pgbackrest/pgbackrest.conf
stanza: tsdb_cluster
status: ok
cipher: none

db (current)
    wal archive min/max (11-1): 000000030000000000000007/000000030000000000000007

    full backup: 20200423-083723F
        timestamp start/stop: 2020-04-23 08:37:23 / 2020-04-23 08:37:33
        wal start/stop: 000000030000000000000007 / 000000030000000000000007
        database size: 23.7MB, backup size: 23.7MB
        repository size: 2.8MB, repository backup size: 2.8MB

Cheers, Khadijah

pgstef commented 4 years ago

Alright.

I've pushed the modification removing the delimiter. I've added your report to the changelog too.

Can you please confirm that this change works for you?

Kind regards,

khadijahvf commented 4 years ago

Yup, confirmed, it is working for me now. Thanks very much!

Cheers, Khadijah

pgstef commented 4 years ago

Great, I'll close this issue then. Kind regards,