vitabaks / postgresql_cluster

PostgreSQL High-Availability Cluster (based on Patroni). Automating with Ansible.
https://postgresql-cluster.org
MIT License
1.48k stars 394 forks source link

Backup and restore question(pgbackrest) #701

Closed BabaTataKaka closed 1 month ago

BabaTataKaka commented 1 month ago

I'm trying to run a PITR on an existing postgres cluster:

ansible-playbook -i ./inventory/dev.yml deploy_pgcluster.yml --tags point_in_time_recovery

The restore is successful on the master but it fail on the replica server due to the backup not being created on the replicas. I've asked a question on the backrest github(https://github.com/pgbackrest/pgbackrest/issues/2400) about that problem and they told me that you cannot run a backup on a standby server(which I think is required by patroni: hot_standby=on). The solution would be to define pg1(master) and pg2(standby) in the pgbackrest config. The postgres_cluster playbook out of the box settings do not suggest such config and wouldn't make sense with patroni(HA). When doing the PITR the playblook clearly try to restore both the master and replicas but the suggested pgbackrest config for the backup doesn't allow it. Is there something I'm doing wrong?

vitabaks commented 1 month ago

Hi @BabaTataKaka

The provided configuration in vars/main.yml is just an example.

For PITR of the entire cluster, it is necessary that the backups are not local but stored on a dedicated pgBackRest server (see pgbackrest_repo_host), S3, or similar storage (see pgbackrest_repo_type).

vitabaks commented 1 month ago

example of a configuration with a dedicated pgBackRest server:

# pgBackRest
pgbackrest_install: true
pgbackrest_install_from_pgdg_repo: true
pgbackrest_stanza: "{{ patroni_cluster_name }}"
pgbackrest_repo_type: "posix"
pgbackrest_repo_host: "***.**.***.**"  # dedicated repository host
pgbackrest_repo_user: "postgres"
pgbackrest_conf_file: "/etc/pgbackrest/pgbackrest.conf"
# Client (database server) config
pgbackrest_conf:
  global:
    - { option: "log-level-file", value: "detail" }
    - { option: "log-path", value: "/var/log/pgbackrest" }
    - { option: "repo1-host", value: "{{ pgbackrest_repo_host }}" }
    - { option: "repo1-host-user", value: "{{ pgbackrest_repo_user }}" }
    - { option: "repo1-type", value: "{{ pgbackrest_repo_type | lower }}" }
    - { option: "repo1-path", value: "/var/lib/pgbackrest" }
    - { option: "spool-path", value: "/var/spool/pgbackrest" }
    - { option: "archive-async", value: "y" }
    - { option: "archive-get-queue-max", value: "1GiB" }
  stanza:
    - { option: "process-max", value: "4" }
    - { option: "log-level-console", value: "info" }
    - { option: "recovery-option", value: "recovery_target_action=promote" }
    - { option: "pg1-socket-path", value: "{{ postgresql_unix_socket_dir }}" }
    - { option: "pg1-path", value: "{{ postgresql_data_dir }}" }
# Server config
pgbackrest_server_conf:
  global:
    - { option: "process-max", value: "4" }
    - { option: "log-level-file", value: "detail" }
    - { option: "log-level-console", value: "info" }
    - { option: "log-path", value: "/var/log/pgbackrest" }
    - { option: "repo1-type", value: "{{ pgbackrest_repo_type | lower }}" }
    - { option: "repo1-path", value: "/var/lib/pgbackrest" }
    - { option: "repo1-retention-full", value: "30" }
    - { option: "repo1-retention-full-type", value: "time" }
    - { option: "repo1-retention-history", value: "365" }
    - { option: "repo1-bundle", value: "y" }
    - { option: "repo1-block", value: "y" }
    - { option: "start-fast", value: "y" }
    - { option: "stop-auto", value: "y" }
    - { option: "resume", value: "n" }
    - { option: "link-all", value: "y" }
    - { option: "archive-check", value: "y" }
    - { option: "archive-copy", value: "n" }
    - { option: "backup-standby", value: "y" }
# the stanza section will be generated automatically

In order for automation to configure the pgbackrest server, its address must be specified in the inventory in the "pgbackrest" group.

BabaTataKaka commented 1 month ago

example of a configuration with a dedicated pgBackRest server:

# pgBackRest
pgbackrest_install: true
pgbackrest_install_from_pgdg_repo: true
pgbackrest_stanza: "{{ patroni_cluster_name }}"
pgbackrest_repo_type: "posix"
pgbackrest_repo_host: "***.**.***.**"  # dedicated repository host
pgbackrest_repo_user: "postgres"
pgbackrest_conf_file: "/etc/pgbackrest/pgbackrest.conf"
# Client (database server) config
pgbackrest_conf:
  global:
    - { option: "log-level-file", value: "detail" }
    - { option: "log-path", value: "/var/log/pgbackrest" }
    - { option: "repo1-host", value: "{{ pgbackrest_repo_host }}" }
    - { option: "repo1-host-user", value: "{{ pgbackrest_repo_user }}" }
    - { option: "repo1-type", value: "{{ pgbackrest_repo_type | lower }}" }
    - { option: "repo1-path", value: "/var/lib/pgbackrest" }
    - { option: "spool-path", value: "/var/spool/pgbackrest" }
    - { option: "archive-async", value: "y" }
    - { option: "archive-get-queue-max", value: "1GiB" }
  stanza:
    - { option: "process-max", value: "4" }
    - { option: "log-level-console", value: "info" }
    - { option: "recovery-option", value: "recovery_target_action=promote" }
    - { option: "pg1-socket-path", value: "{{ postgresql_unix_socket_dir }}" }
    - { option: "pg1-path", value: "{{ postgresql_data_dir }}" }
# Server config
pgbackrest_server_conf:
  global:
    - { option: "process-max", value: "4" }
    - { option: "log-level-file", value: "detail" }
    - { option: "log-level-console", value: "info" }
    - { option: "log-path", value: "/var/log/pgbackrest" }
    - { option: "repo1-type", value: "{{ pgbackrest_repo_type | lower }}" }
    - { option: "repo1-path", value: "/var/lib/pgbackrest" }
    - { option: "repo1-retention-full", value: "30" }
    - { option: "repo1-retention-full-type", value: "time" }
    - { option: "repo1-retention-history", value: "365" }
    - { option: "repo1-bundle", value: "y" }
    - { option: "repo1-block", value: "y" }
    - { option: "start-fast", value: "y" }
    - { option: "stop-auto", value: "y" }
    - { option: "resume", value: "n" }
    - { option: "link-all", value: "y" }
    - { option: "archive-check", value: "y" }
    - { option: "archive-copy", value: "n" }
    - { option: "backup-standby", value: "y" }
# the stanza section will be generated automatically

In order for automation to configure the pgbackrest server, its address must be specified in the inventory in the "pgbackrest" group.

Thanks for the reply. I'm aware that the a remote repos must be specified. My questions is more in regard to the standby(replicas?) vs primary. Do I have to specify the specify pg1-host, pg2-host in the pgbackrest section? Like the following:

- { option: "pg1-host", value: "pg-1.server.ta" }
- { option: "pg1-path", value: "{{ postgresql_data_dir }}" }
- { option: "pg2-host", value: "pg-2.server.ta" }
- { option: "pg2-path", value: "{{ postgresql_data_dir }}" }
- { option: "pg3-host", value: "pg-3.server.ta" }
- { option: "pg3-path", value: "{{ postgresql_data_dir }}" }

Just want to make sure these are the rights pgbackrest settings for what the playbook is deploying in term of cluster.

vitabaks commented 1 month ago

You don't need to do this with your hands, as automation will do it for you.

When backup-standby set to 'y', standby servers will be automatically added to the stanza section.

BabaTataKaka commented 1 month ago

OK now I understand. When setting backup-standby, the pgbackrest.conf.j2 will create the pg1-host , pg2-host entries. So i guess that answer my question. Sorry, i completely missed that part.

BabaTataKaka commented 1 month ago

Now getting the same error but with 2 warnings:

2024-07-17 20:29:01.072 P00   INFO: backup command begin 2.52.1: --backup-standby --exec-id=83988-da309fc4 --log-level-console=info --log-level-file=detail --log-path=/var/log/pgbackrest --pg2-host=pg-1.server.ta --pg3-host=pg-3.server.ta --pg1-host-ca-file=/etc/postgresql/certs/tls-ca.crt --pg1-host-cert-file=/etc/postgresql/certs/tls.crt --pg1-host-key-file=/etc/postgresql/certs/pkc8.key --pg1-host-type=tls --pg1-path=/var/lib/postgresql/16/main --pg2-path=/var/lib/postgresql/16/main --pg3-path=/var/lib/postgresql/16/main --pg2-port=5432 --pg3-port=5432 --pg1-socket-path=/var/run/postgresql --process-max=4 --repo1-path=/var/lib/pgbackrest --repo1-retention-archive=4 --repo1-retention-full=4 --repo1-type=posix --no-resume --stanza=postgres-cluster --start-fast --stop-auto --type=full

2024-07-17 20:29:01.992 P00   WARN: unable to check pg2: [UnknownError] remote-0 process on 'pg-1.server.ta' terminated unexpectedly [1]

2024-07-17 20:29:02.329 P00   WARN: unable to check pg3: [UnknownError] remote-0 process on 'pg-3.server.ta' terminated unexpectedly [1]

2024-07-17 20:29:02.329 P00  ERROR: [056]: unable to find primary cluster - cannot proceed

                                    HINT: are all available clusters in recovery?

I only accept cert auth connection in my pg_hba file(zero trust). I guess I should configure pgbackrest to auth with certs? However, if I look on pg2 and pg3 server postgres log I see no connection attempt at all. So it might be something else?

vitabaks commented 1 month ago

See the details in the log located at /var/log/pgbackrest

I haven't used pg1-host-type=tls yet (pgBackRest TLS server), only ssh ("Secure Shell" mode). Make sure that you have configured pgBackRest correctly.

This topic goes a little beyond the scope of the cluster deployment question. See the pgBackRest documentation and also this blog, which may be useful if you really need pgBackRest TLS server mode. However, I suspect that you are still interested in SSH mode, just with authentication in PostgreSQL by certificate.

If you are using SSH mode, you do not need to specify certificate paths in the pgBackRest configuration. These parameters are applicable only when using TLS pgBackRest mode. In this case, the entire authentication setup takes place in PostgreSQL and SSH.

BabaTataKaka commented 1 month ago

Decided to create a decidated pgbackrest server and not restore from standby