vitabaks / postgresql_cluster

PostgreSQL High-Availability Cluster (based on "Patroni" and DCS "etcd" or "consul"). Automating with Ansible.
MIT License
1.29k stars 352 forks source link

add support wal-g #32

Closed patsevanton closed 4 years ago

patsevanton commented 4 years ago
patsevanton commented 4 years ago

image

patsevanton commented 4 years ago

Variable from .walg.json i will move to vars/main.yml

patsevanton commented 4 years ago

Do you have variable for PGDATA /var/lib/pgsql/11/data/ ? PGHOST /var/run/postgresql/.s.PGSQL.5432 work only CentOS?

How archive_command wal-g wal-push '%p' move to variable?

vitabaks commented 4 years ago

Do you have variable for PGDATA

See RedHat.yml and Debian.yml variable files

patsevanton commented 4 years ago

Ok. PGDATA /var/lib/pgsql/11/data/ ? PGHOST /var/run/postgresql/.s.PGSQL.5432 move to RedHat.yml

vitabaks commented 4 years ago

templates/.walg.json.j2

"PGDATA": "{{ postgresql_data_dir }}", "PGHOST": "{{ postgresql_unix_socket_dir }}/.s.PGSQL.5432",

And need add the variables for other parameters ".walg.json.j2" in the # WAL-G section (vars/main.yml)

patsevanton commented 4 years ago

{{ postgresql_unix_socket_dir }}/.s.PGSQL.5432 work for Debian?

vitabaks commented 4 years ago

{{ postgresql_unix_socket_dir }}/.s.PGSQL.5432 work for Debian?

Yes. See Debian.yml variable file.

The postgresql port can be changed, specify this: "PGHOST": "{{ postgresql_unix_socket_dir }}/.s.PGSQL.{{ postgresql_port }}",

patsevanton commented 4 years ago
TASK [wal-g | generate conf file .walg.json.j2] ************************************************************************************************
fatal: [172.26.9.198]: FAILED! => {"changed": false, "checksum": "051b2c33a9f921ccc77d4d2ab257afe6c82f512b", "msg": "Unsupported parameters for (copy) module: tags Supported parameters include: _original_basename, attributes, backup, checksum, content, delimiter, dest, directory_mode, follow, force, group, local_follow, mode, owner, regexp, remote_src, selevel, serole, setype, seuser, src, unsafe_writes, validate"}
fatal: [172.26.9.199]: FAILED! => {"changed": false, "checksum": "051b2c33a9f921ccc77d4d2ab257afe6c82f512b", "msg": "Unsupported parameters for (copy) module: tags Supported parameters include: _original_basename, attributes, backup, checksum, content, delimiter, dest, directory_mode, follow, force, group, local_follow, mode, owner, regexp, remote_src, selevel, serole, setype, seuser, src, unsafe_writes, validate"}
fatal: [172.26.9.20]: FAILED! => {"changed": false, "checksum": "051b2c33a9f921ccc77d4d2ab257afe6c82f512b", "msg": "Unsupported parameters for (copy) module: tags Supported parameters include: _original_basename, attributes, backup, checksum, content, delimiter, dest, directory_mode, follow, force, group, local_follow, mode, owner, regexp, remote_src, selevel, serole, setype, seuser, src, unsafe_writes, validate"}

Check tomorrow

vitabaks commented 4 years ago
TASK [wal-g | generate conf file .walg.json.j2] ************************************************************************************************
fatal: [172.26.9.198]: FAILED! => {"changed": false, "checksum": "051b2c33a9f921ccc77d4d2ab257afe6c82f512b", "msg": "Unsupported parameters for (copy) module: tags Supported parameters include: _original_basename, attributes, backup, checksum, content, delimiter, dest, directory_mode, follow, force, group, local_follow, mode, owner, regexp, remote_src, selevel, serole, setype, seuser, src, unsafe_writes, validate"}
fatal: [172.26.9.199]: FAILED! => {"changed": false, "checksum": "051b2c33a9f921ccc77d4d2ab257afe6c82f512b", "msg": "Unsupported parameters for (copy) module: tags Supported parameters include: _original_basename, attributes, backup, checksum, content, delimiter, dest, directory_mode, follow, force, group, local_follow, mode, owner, regexp, remote_src, selevel, serole, setype, seuser, src, unsafe_writes, validate"}
fatal: [172.26.9.20]: FAILED! => {"changed": false, "checksum": "051b2c33a9f921ccc77d4d2ab257afe6c82f512b", "msg": "Unsupported parameters for (copy) module: tags Supported parameters include: _original_basename, attributes, backup, checksum, content, delimiter, dest, directory_mode, follow, force, group, local_follow, mode, owner, regexp, remote_src, selevel, serole, setype, seuser, src, unsafe_writes, validate"}

Check tomorrow

You have an error associated with an extra space for the tag.

Fix:

- name: wal-g | generate conf file .walg.json.j2
  template:
    src: templates/.walg.json.j2
    dest: '/var/lib/pgsql/.walg.json'
    owner: postgres
    group: postgres
    mode: 0644
  tags: wal_g_install
vitabaks commented 4 years ago

You need to create new variable postgresql_restore_command in the file /vars/main.yml for patroni.yml.j2 template in postgresql.recovery_conf section (not for dcs bootstrap)

recovery_conf: additional configuration settings written to recovery.conf when configuring follower.

See https://patroni.readthedocs.io/en/latest/SETTINGS.html#postgresql

If it is difficult for you, I can do it myself.

patsevanton commented 4 years ago

Work on Redhat image

patsevanton commented 4 years ago

You need to create new variable postgresql_restore_command in the file /vars/main.yml for patroni.yml.j2 template in postgresql.recovery_conf section (not for dcs bootstrap)

recovery_conf: additional configuration settings written to recovery.conf when configuring follower.

See https://patroni.readthedocs.io/en/latest/SETTINGS.html#postgresql

If it is difficult for you, I can do it myself.

restore_command = 'wal-g wal-fetch %f %p'

? check tommorow

vitabaks commented 4 years ago

? check tommorow

specify in the patroni.yml.j2 template, below the create_replica_methods parameters

{% if postgresql_restore_command is defined and postgresql_restore_command | length > 0 %}
  recovery_conf:
    restore_command: {{ postgresql_restore_command }}
{% endif %}

in vars/main.yml , bellow the patroni_create_replica_methods basebackup vars:

postgresql_restore_command: [] # written to recovery.conf when configuring follower (create replica)
# postgresql_restore_command: 'wal-g wal-fetch %f %p' # restore WAL-s using WAL-G
patsevanton commented 4 years ago

templates/.walg.json.j2

"PGDATA": "{{ postgresql_data_dir }}", "PGHOST": "{{ postgresql_unix_socket_dir }}/.s.PGSQL.5432",

And need add the variables for other parameters ".walg.json.j2" in the # WAL-G section (vars/main.yml)

Work with

"PGHOST": "{{ postgresql_unix_socket_dir }}",

on CentOS

patsevanton commented 4 years ago

restore_from_backup

bootstrap:
  method: restore_from_backup
  restore_from_backup:
    command: sudo -u postgres wal-g backup-fetch --datadir /val/lib/postgresql/data LATEST
    recovery_conf: 
       recovery_target_action: promote
       recovery_target_timeline: latest
       restore_command: sudo -u postgres wal-g wal-fetch "%f" "%p"

  dcs:
    postgresql:
      use_pg_rewind: true
      create_replica_method:
        - wal_e
        - basebackup
      wal_e:
        command: sudo -u postgres wal-g backup-fetch --datadir /val/lib/postgresql/data LATEST
        no_master: 1

https://github.com/zalando/patroni/issues/636

How add command in restore_from_backup to playbook?

patsevanton commented 4 years ago

postgresql_restore_command

Created postgresql_restore_command

vitabaks commented 4 years ago

How add command in restore_from_backup to playbook?

This is another task...

I propose to implement bootstrap cluster from wal-g backup after we implement create_replica_method wal-g.

patsevanton commented 4 years ago

If set wal_g_install to true, then wal-g have been simple installed. image

vitabaks commented 4 years ago

If set wal_g_install to true, then wal-g have been simple installed.

It is correct. This matches your condition: when: wal_g_install == "true"

patsevanton commented 4 years ago

Error: unknown flag: --no_master

Mar 31 02:21:42 walg-apatsev-2 systemd: Started Runners to orchestrate a high-availability PostgreSQL - patroni.
Mar 31 02:21:42 walg-apatsev-2 patroni: 2020-03-31 02:21:42,299 INFO: Selected new etcd server http://172.26.9.198:2379
Mar 31 02:21:42 walg-apatsev-2 patroni: 2020-03-31 02:21:42,307 INFO: No PostgreSQL configuration items changed, nothing to reload.
Mar 31 02:21:42 walg-apatsev-2 patroni: 2020-03-31 02:21:42,371 INFO: trying to bootstrap from leader 'pgnode01'
Mar 31 02:21:42 walg-apatsev-2 patroni: Error: unknown flag: --no_master
Mar 31 02:21:42 walg-apatsev-2 patroni: Usage:
Mar 31 02:21:42 walg-apatsev-2 patroni: wal-g backup-fetch destination_directory backup_name [flags]
Mar 31 02:21:42 walg-apatsev-2 patroni: Flags:
Mar 31 02:21:42 walg-apatsev-2 patroni: -h, --help                  help for backup-fetch
Mar 31 02:21:42 walg-apatsev-2 patroni: --mask string           Fetches only files which path relative to destination_directory
Mar 31 02:21:42 walg-apatsev-2 patroni: matches given shell file pattern.
Mar 31 02:21:42 walg-apatsev-2 patroni: For information about pattern syntax view: https://golang.org/pkg/path/filepath/#Match
Mar 31 02:21:42 walg-apatsev-2 patroni: --restore-spec string   Path to file containing tablespace restore specification
Mar 31 02:21:42 walg-apatsev-2 patroni: Global Flags:
Mar 31 02:21:42 walg-apatsev-2 patroni: --config string   config file (default is $HOME/.walg.json)
Mar 31 02:21:42 walg-apatsev-2 patroni: unknown flag: --no_master
Mar 31 02:21:42 walg-apatsev-2 patroni: 2020-03-31 02:21:42,401 ERROR: Error creating replica using method wal_g: wal-g backup-fetch /var/lib/pgsql/11/data LATEST exited with code=1
Mar 31 02:21:43 walg-apatsev-2 patroni: 2020-03-31 02:21:43,497 INFO: replica has been created using basebackup
Mar 31 02:21:43 walg-apatsev-2 patroni: 2020-03-31 02:21:43,498 INFO: bootstrapped from leader 'pgnode01'
Mar 31 02:21:43 walg-apatsev-2 patroni: 2020-03-31 02:21:43,729 INFO: postmaster pid=27686
Mar 31 02:21:43 walg-apatsev-2 patroni: /var/run/postgresql:5432 - no response

image

-bash-4.2$ cat $HOME/.walg.json
{
  "AWS_ACCESS_KEY_ID": "minio",
  "AWS_ENDPOINT": "http://172.26.9.200:9000",
  "AWS_S3_FORCE_PATH_STYLE": "true",
  "AWS_SECRET_ACCESS_KEY": "miniosecret",
  "PGDATA": "/var/lib/pgsql/11/data",
  "PGHOST": "/var/run/postgresql",
  "WALE_S3_PREFIX": "s3://bucket",
  "WALG_COMPRESSION_METHOD": "brotli"
}
vitabaks commented 4 years ago

This is patroni flag.

A special no_master parameter, if defined, allows Patroni to call the replica creation method even if there is no running master or replicas. In that case, an empty string will be passed in a connection string. This is useful for restoring the formerly running cluster from the binary backup.

https://patroni.readthedocs.io/en/latest/replica_bootstrap.html#building-replicas

patsevanton commented 4 years ago

Create replicas using wal-g


patroni: ERROR: 2020/03/31 11:13:13.071549 Failed to fetch backup: No backups found
patroni: 2020-03-31 11:13:13,075 ERROR: Error creating replica using method wal_g: wal-g backup-fetch /var/lib/pgsql/11/data LATEST exited with code=1
patroni: 2020-03-31 11:13:14,474 INFO: replica has been created using basebackup
patroni: 2020-03-31 11:13:14,475 INFO: bootstrapped from leader 'pgnode01'```
vitabaks commented 4 years ago

Failed to fetch backup: No backups found

create backup and try again.

vitabaks commented 4 years ago

https://github.com/zalando/patroni/issues/1475

vitabaks commented 4 years ago

It remains to perform additional tests.

patsevanton commented 4 years ago

Bootstrap FROM wal-g backup - successful.

Apr 13 05:21:07 walg-apatsev-1 systemd: Started Runners to orchestrate a high-availability PostgreSQL - patroni.
Apr 13 05:21:07 walg-apatsev-1 patroni: 2020-04-13 05:21:07,658 INFO: Selected new etcd server http://172.26.9.198:2379
Apr 13 05:21:07 walg-apatsev-1 patroni: 2020-04-13 05:21:07,665 INFO: No PostgreSQL configuration items changed, nothing to reload.
Apr 13 05:21:07 walg-apatsev-1 patroni: 2020-04-13 05:21:07,776 INFO: trying to bootstrap a new cluster
Apr 13 05:21:07 walg-apatsev-1 patroni: 2020-04-13 05:21:07,776 INFO: Running custom bootstrap script: /etc/patroni/wal_g_bootstrap.sh
Apr 13 05:21:07 walg-apatsev-1 patroni: INFO: 2020/04/13 05:21:07.809582 LATEST backup is: 'base_000000010000000000000006'
Apr 13 05:21:07 walg-apatsev-1 patroni: INFO: 2020/04/13 05:21:07.869223 Finished decompression of part_003.tar.br
Apr 13 05:21:07 walg-apatsev-1 patroni: INFO: 2020/04/13 05:21:07.869242 Finished extraction of part_003.tar.br
Apr 13 05:21:17 walg-apatsev-1 patroni: 2020-04-13 05:21:17,671 INFO: not healthy enough for leader race
Apr 13 05:21:17 walg-apatsev-1 patroni: 2020-04-13 05:21:17,683 INFO: bootstrap in progress
Apr 13 05:21:21 walg-apatsev-1 patroni: INFO: 2020/04/13 05:21:21.344271 Finished extraction of part_001.tar.br
Apr 13 05:21:21 walg-apatsev-1 patroni: INFO: 2020/04/13 05:21:21.344816 Finished decompression of part_001.tar.br
Apr 13 05:21:21 walg-apatsev-1 patroni: INFO: 2020/04/13 05:21:21.378345 Finished decompression of pg_control.tar.br
Apr 13 05:21:21 walg-apatsev-1 patroni: INFO: 2020/04/13 05:21:21.378363 Finished extraction of pg_control.tar.br
Apr 13 05:21:21 walg-apatsev-1 patroni: INFO: 2020/04/13 05:21:21.378372
Apr 13 05:21:21 walg-apatsev-1 patroni: Backup extraction complete.
patsevanton commented 4 years ago

Create replica FROM wal-g backup - successful.

Apr 13 05:21:33 walg-apatsev-2 systemd: Started Runners to orchestrate a high-availability PostgreSQL - patroni.
Apr 13 05:21:33 walg-apatsev-2 patroni: 2020-04-13 05:21:33,893 INFO: Selected new etcd server http://172.26.9.198:2379
Apr 13 05:21:33 walg-apatsev-2 patroni: 2020-04-13 05:21:33,903 INFO: No PostgreSQL configuration items changed, nothing to reload.
Apr 13 05:21:33 walg-apatsev-2 patroni: INFO: 2020/04/13 05:21:33.991714 LATEST backup is: 'base_000000010000000000000006'
Apr 13 05:21:34 walg-apatsev-2 patroni: INFO: 2020/04/13 05:21:34.055593 Finished decompression of part_003.tar.br
Apr 13 05:21:34 walg-apatsev-2 patroni: INFO: 2020/04/13 05:21:34.055621 Finished extraction of part_003.tar.br
Apr 13 05:21:45 walg-apatsev-2 systemd: [/etc/systemd/system/patroni.service:14] Not an absolute path, ignoring: ~
Apr 13 05:21:45 walg-apatsev-2 ansible-lineinfile: Invoked with directory_mode=None force=None remote_src=None backrefs=False insertafter=None path=/etc/environment owner=root follow=False line=export PATRONICTL_CONFIG_FILE=/etc/patroni/patroni.yml group=root insertbefore=None unsafe_writes=None create=False state=present content=NOT_LOGGING_PARAMETER serole=None setype=None dest=/etc/environment selevel=None regexp=^export PATRONICTL_CONFIG_FILE validate=None src=None seuser=None delimiter=None mode=0644 firstmatch=False attributes=None backup=False
Apr 13 05:21:49 walg-apatsev-2 patroni: INFO: 2020/04/13 05:21:49.348950 Finished extraction of part_001.tar.br
Apr 13 05:21:49 walg-apatsev-2 patroni: INFO: 2020/04/13 05:21:49.349698 Finished decompression of part_001.tar.br
Apr 13 05:21:49 walg-apatsev-2 patroni: INFO: 2020/04/13 05:21:49.387463 Finished decompression of pg_control.tar.br
Apr 13 05:21:49 walg-apatsev-2 patroni: INFO: 2020/04/13 05:21:49.387481 Finished extraction of pg_control.tar.br
Apr 13 05:21:49 walg-apatsev-2 patroni: INFO: 2020/04/13 05:21:49.387491
Apr 13 05:21:49 walg-apatsev-2 patroni: Backup extraction complete.
Apr 13 05:21:49 walg-apatsev-2 patroni: 2020-04-13 05:21:49,392 INFO: replica has been created using wal_g