vitabaks / postgresql_cluster

PostgreSQL High-Availability Cluster (based on "Patroni" and DCS "etcd" or "consul"). Automating with Ansible.
MIT License
1.29k stars 352 forks source link

ERROR: [068]: archive_command 'cd .' must contain pgbackrest #393

Closed chuegel closed 10 months ago

chuegel commented 10 months ago

I was trying to do a full backup pgbackrest --stanza=postgres-cluster --type=full backup from a pgbackrest server and got this error:

ERROR: [068]: archive_command 'cd .' must contain pgbackrest
2023-07-04 17:37:56.033 P00   INFO: backup command end: aborted with exception [068]

Cluster was deployed with the lastest (1.7.0) version and is up and running.

Any hints how to fix this?

vitabaks commented 10 months ago

Hi

  1. Change archive_command in postgresql_parameters variable
  - { option: "archive_command", value: "pgbackrest --stanza={{ pgbackrest_stanza }} archive-push %p" }  # archive WAL-s using pgbackrest
  1. Apply changes
ansible-playbook config_pgcluster.yml --tags patroni_conf
  1. Try backup
vitabaks commented 10 months ago

I will try to automate the setting of the archive_command parameter based on the selected backup tool because changing this command manually is not very convenient (and it is not obvious to everyone)

vitabaks commented 10 months ago

I will try to automate the setting of the archive_command parameter

PR https://github.com/vitabaks/postgresql_cluster/pull/394

chuegel commented 10 months ago

Thanks but I did encounter another issue.

Logs from master

2023-07-05 17:30:12.921 P00   INFO: archive-push command begin 2.46: [pg_wal/000000010000000000000001] --exec-id=13710-7871c9c1 --log-level-console=info --log-level-file=detail --log-path=/var/log/pgbackrest --pg1-path=/var/lib/postgresql/15/main --process-max=4 --repo1-path=/var/lib/pgbackrest --repo1-type=posix --stanza=postgres-cluster
ERROR: [103]: unable to find a valid repository:
       repo1: [FileMissingError] unable to load info file '/var/lib/pgbackrest/archive/postgres-cluster/archive.info' or '/var/lib/pgbackrest/archive/postgres-cluster/archive.info.copy':
       FileMissingError: unable to open missing file '/var/lib/pgbackrest/archive/postgres-cluster/archive.info' for read
       FileMissingError: unable to open missing file '/var/lib/pgbackrest/archive/postgres-cluster/archive.info.copy' for read
       HINT: archive.info cannot be opened but is required to push/get WAL segments.
       HINT: is archive_command configured correctly in postgresql.conf?
       HINT: has a stanza-create been performed?
       HINT: use --no-archive-check to disable archive checks during backup if you have an alternate archiving scheme.
2023-07-05 17:30:12.928 P00   INFO: archive-push command end: aborted with exception [103]
2023-07-05 17:30:12 UTC [13516-1]  LOG:  archive command failed with exit code 103
2023-07-05 17:30:12 UTC [13516-2]  DETAIL:  The failed archive command was: pgbackrest --stanza=postgres-cluster archive-push pg_wal/000000010000000000000001
2023-07-05 17:30:12 UTC [13508-2]  LOG:  checkpoint complete: wrote 68 buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.013 s, sync=0.003 s, total=0.103 s; sync files=15, longest=0.001 s, average=0.001 s; distance=8961 kB, estimate=8961 kB
2023-07-05 17:30:13 UTC [13508-3]  LOG:  checkpoint starting: immediate force wait
2023-07-05 17:30:13 UTC [13508-4]  LOG:  checkpoint complete: wrote 0 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.003 s, sync=0.001 s, total=0.191 s; sync files=0, longest=0.000 s, average=0.000 s; distance=16384 kB, estimate=16384 kB

From the dedicated pgbackrest server:

postgres@patroni-backup:~$ pgbackrest --stanza=postgres-cluster --type=full backup
WARN: configuration file contains invalid option 'include-path'
2023-07-05 17:40:16.183 P00   INFO: backup command begin 2.46: --archive-check --no-archive-copy --backup-standby --exec-id=6239-f33ec813 --log-level-console=info --log-level-file=detail --log-path=/var/log/pgbackrest --pg1-host=192.168.100.101 --pg2-host=192.168.100.102 --pg3-host=192.168.100.103 --pg1-path=/var/lib/postgresql/15/main --pg2-path=/var/lib/postgresql/15/main --pg3-path=/var/lib/postgresql/15/main --pg1-port=5432 --pg2-port=5432 --pg3-port=5432 --repo1-block --repo1-bundle --repo1-path=/var/lib/pgbackrest --repo1-retention-archive=4 --repo1-retention-full=4 --repo1-type=posix --no-resume --stanza=postgres-cluster --start-fast --stop-auto --type=full
2023-07-05 17:40:21.739 P00   INFO: execute non-exclusive backup start: backup begins after the requested immediate checkpoint completes
2023-07-05 17:40:22.545 P00   INFO: backup start archive = 000000010000000000000006, lsn = 0/6000028
2023-07-05 17:40:22.546 P00   INFO: wait for replay on the standby to reach 0/6000028
2023-07-05 17:40:23.560 P00   INFO: replay on the standby reached 0/6000028
2023-07-05 17:40:23.560 P00   INFO: check archive for prior segment 000000010000000000000005
ERROR: [082]: WAL segment 000000010000000000000005 was not archived before the 60000ms timeout
       HINT: check the archive_command to ensure that all options are correct (especially --stanza).
       HINT: check the PostgreSQL server log for errors.
       HINT: run the 'start' command if the stanza was previously stopped.
2023-07-05 17:41:23.561 P00   INFO: backup command end: aborted with exception [082]
vitabaks commented 10 months ago

Please check archive_command

psql -c "show archive_command"

vitabaks commented 10 months ago

On dedicated pgbackrest server and database server:

pgbackrest --stanza=postgres-cluster info

chuegel commented 10 months ago

Thanks for your reply:

postgres@patroni1:~$ psql -c "show archive_command"
                   archive_command
------------------------------------------------------
 pgbackrest --stanza=postgres-cluster archive-push %p
(1 row)

postgres@patroni1:~$ pgbackrest --stanza=postgres-cluster info
stanza: postgres-cluster
    status: error (missing stanza path)
postgres@patroni-backup:~$ pgbackrest --stanza=postgres-cluster info
WARN: configuration file contains invalid option 'include-path'
stanza: postgres-cluster
    status: error (no valid backups)
    cipher: none

    db (current)
        wal archive min/max (15): none present
vitabaks commented 10 months ago

missing stanza path

Please attach the pgbackrest configuration files from the dedicated pgbackrest and database servers.

vitabaks commented 10 months ago

I can assume that you left an example for local backups and not for using a dedicated backup server

this is what the configuration for a database server should look like if you are using a dedicated server

pgbackrest_conf:
  global:  # [global] section
    - { option: "log-level-file", value: "detail" }
    - { option: "log-path", value: "/var/log/pgbackrest" }
    - { option: "repo1-host", value: "{{ pgbackrest_repo_host }}" }
    - { option: "repo1-host-user", value: "{{ pgbackrest_repo_user }}" }
    - { option: "repo1-type", value: "{{ pgbackrest_repo_type |lower }}" }
    - { option: "repo1-path", value: "/var/lib/pgbackrest" }

please note that the 'repo1-host' and 'repo1-host-user' options have been added here and the backup configuration options have been removed since now it is a client for a dedicated backup server.

vitabaks commented 10 months ago

I have automated the verification and configuration of these important parameters

PR https://github.com/vitabaks/postgresql_cluster/pull/399

chuegel commented 10 months ago

I can assume that you left an example for local backups and not for using a dedicated backup server

this is what the configuration for a database server should look like if you are using a dedicated server

pgbackrest_conf:
  global:  # [global] section
    - { option: "log-level-file", value: "detail" }
    - { option: "log-path", value: "/var/log/pgbackrest" }
    - { option: "repo1-host", value: "{{ pgbackrest_repo_host }}" }
    - { option: "repo1-host-user", value: "{{ pgbackrest_repo_user }}" }
    - { option: "repo1-type", value: "{{ pgbackrest_repo_type |lower }}" }
    - { option: "repo1-path", value: "/var/lib/pgbackrest" }

please note that the 'repo1-host' and 'repo1-host-user' options have been added here and the backup configuration options have been removed since now it is a client for a dedicated backup server.

Yes, these are my config parameters:

# pgBackRest
pgbackrest_install: true  # or 'true'
pgbackrest_install_from_pgdg_repo: true  # or 'false'
pgbackrest_stanza: "{{ patroni_cluster_name }}"  # specify your --stanza
pgbackrest_repo_type: "posix"  # or "s3", "gcs", "azure"
pgbackrest_repo_host: "192.168.100.104"  # dedicated repository host (optional)
pgbackrest_repo_user: "postgres"
pgbackrest_conf_file: "/etc/pgbackrest/pgbackrest.conf"
# see more options https://pgbackrest.org/configuration.html
pgbackrest_conf:
  global:  # [global] section
    - { option: "log-level-file", value: "detail" }
    - { option: "log-path", value: "/var/log/pgbackrest" }
#    - { option: "repo1-host", value: "{{ pgbackrest_repo_host }}" }
#    - { option: "repo1-host-user", value: "{{ pgbackrest_repo_user }}" }
    - { option: "repo1-type", value: "{{ pgbackrest_repo_type |lower }}" }
    - { option: "repo1-path", value: "/var/lib/pgbackrest" }
    - { option: "repo1-retention-full", value: "4" }
    - { option: "repo1-retention-archive", value: "4" }
    - { option: "start-fast", value: "y" }
    - { option: "stop-auto", value: "y" }
    - { option: "resume", value: "n" }
    - { option: "link-all", value: "y" }
#    - { option: "", value: "" }
  stanza:  # [stanza_name] section
    - { option: "pg1-path", value: "{{ postgresql_data_dir }}" }
    - { option: "recovery-option", value: "recovery_target_action=promote" }
    - { option: "log-level-console", value: "info" }
    - { option: "process-max", value: "4" }
#    - { option: "", value: "" }
# (optional) dedicated backup server config (if "repo_host" is set)
pgbackrest_server_conf:
  global:
    - { option: "log-level-file", value: "detail" }
    - { option: "log-level-console", value: "info" }
    - { option: "log-path", value: "/var/log/pgbackrest" }
    - { option: "repo1-type", value: "{{ pgbackrest_repo_type |lower }}" }
    - { option: "repo1-path", value: "/var/lib/pgbackrest" }
    - { option: "repo1-retention-full", value: "4" }
    - { option: "repo1-retention-archive", value: "4" }
    - { option: "repo1-bundle", value: "y" }
    - { option: "repo1-block", value: "y" }
    - { option: "start-fast", value: "y" }
    - { option: "stop-auto", value: "y" }
    - { option: "resume", value: "n" }
    - { option: "link-all", value: "y" }
    - { option: "archive-check", value: "y" }
    - { option: "archive-copy", value: "n" }
    - { option: "backup-standby", value: "y" }
#    - { option: "", value: "" }
# the stanza section will be generated automatically
vitabaks commented 10 months ago

just uncomment these options or use the new version of the playbook from the master branch

chuegel commented 10 months ago

Thanks, that worked

pgbackrest --stanza=postgres-cluster --type=full backup
WARN: configuration file contains invalid option 'include-path'
2023-07-06 18:29:33.976 P00   INFO: backup command begin 2.46: --archive-check --no-archive-copy --backup-standby --exec-id=7035-df813319 --log-level-console=info --log-level-file=detail --log-path=/var/log/pgbackrest --pg1-host=192.168.100.101 --pg2-host=192.168.100.102 --pg3-host=192.168.100.103 --pg1-path=/var/lib/postgresql/15/main --pg2-path=/var/lib/postgresql/15/main --pg3-path=/var/lib/postgresql/15/main --pg1-port=5432 --pg2-port=5432 --pg3-port=5432 --repo1-block --repo1-bundle --repo1-path=/var/lib/pgbackrest --repo1-retention-archive=4 --repo1-retention-full=4 --repo1-type=posix --no-resume --stanza=postgres-cluster --start-fast --stop-auto --type=full
2023-07-06 18:29:39.345 P00   INFO: execute non-exclusive backup start: backup begins after the requested immediate checkpoint completes
2023-07-06 18:29:40.151 P00   INFO: backup start archive = 000000010000000000000005, lsn = 0/5000110
2023-07-06 18:29:40.151 P00   INFO: wait for replay on the standby to reach 0/5000110
2023-07-06 18:29:41.165 P00   INFO: replay on the standby reached 0/5000110
2023-07-06 18:29:41.166 P00   INFO: check archive for prior segment 000000010000000000000004
2023-07-06 18:29:45.190 P00   INFO: execute non-exclusive backup stop and wait for all WAL segments to archive
2023-07-06 18:29:45.392 P00   INFO: backup stop archive = 000000010000000000000005, lsn = 0/50001E8
2023-07-06 18:29:45.395 P00   INFO: check archive for segment(s) 000000010000000000000005:000000010000000000000005
2023-07-06 18:29:46.702 P00   INFO: new backup label = 20230706-182939F
2023-07-06 18:29:46.762 P00   INFO: full backup size = 29.3MB, file total = 1268
2023-07-06 18:29:46.763 P00   INFO: backup command end: completed successfully (12792ms)
2023-07-06 18:29:46.763 P00   INFO: expire command begin 2.46: --exec-id=7035-df813319 --log-level-console=info --log-level-file=detail --log-path=/var/log/pgbackrest --repo1-path=/var/lib/pgbackrest --repo1-retention-archive=4 --repo1-retention-full=4 --repo1-type=posix --stanza=postgres-cluster
2023-07-06 18:29:46.868 P00   INFO: expire command end: completed successfully (105ms)