YanChii / ansible-role-postgres-ha

Create postgresql HA auto-failover cluster using pcs, pacemaker and PAF
Apache License 2.0
33 stars 22 forks source link

postgres-ha

With this role, you will transform your standalone postgresql server to N-node postgres cluster with automated failover. You only need one working postgresql server and other hosts with clean CentOS 7 or CentOS 6 minimal install.

Alternatively, this role can create a database cluster for you from scratch. If no postgres database is detected, it will be created.

What it will do:

Automated failover is setup using PAF pacemaker module: https://github.com/dalibo/PAF

What you should know

Requirements

This role works on CentOS 6 and 7. RHEL was not tested but should work without problem. If you need support for other distribution, I can help. Post an issue.

The postgresql binaries on your primary server should be installed from the official repository:

https://yum.postgresql.org/repopackages.php

Note: If you have binaries from other repo, you need to modify the postgres_ha_repo_url variable to change the postgres repository source and maybe also bindir and datadir paths in other role variables. If you need to change the installed package name(s), you need to directly modify install pg* task in tasks/postgresql_sync.yml file.

Role Variables

For all variables with description see defaults/main.yml

Variables that must be changed:

Password hash can be generated for example by this command:

python -c 'import crypt; print(crypt.crypt("my_cluster_ha_password", crypt.mksalt(crypt.METHOD_SHA512)))'

Dependencies

No other roles are required as a dependency. However you can combine this role with some other role that installs a postgresql database.

Example Playbook

The usage is relatively simple - install minimal CentOS-es, set the variables and run the role.

Two settings are required:

    - name: install PG HA
      hosts: db?
      gather_facts: True
      any_errors_fatal: True
      vars:
        postgres_ha_cluster_master_host: db1
        postgres_ha_cluster_vip: 10.10.10.10
        postgres_ha_pg_repl_pass: MySuperSecretDBPass
        postgres_ha_cluster_ha_password: AnotherSuperSecretPass1234
        postgres_ha_cluster_ha_password_hash: '$6$mHeZ7/LD1y.........7VJYu.'
      pre_tasks:
        - name: disable firewall
          service: name=firewalld state=stopped enabled=no
      roles:
         - postgres-ha

Cleanup after failure

If the role fails repeatedly and you want to run it fresh as if it was the first time, you need to clean up some things. Please note that default resource names are used here. If you change them using variables, you need to change it also in these commands.

Make sure no (related) cluster resources are defined.

- RUN ON ALL SLAVE NODES:

systemctl stop postgresql-9.6

Make sure no postgres db is running.

systemctl status postgresql-9.6 ps aux | grep postgres rm -rf /var/lib/pgsql/9.6/data rm -f /var/lib/pgsql/9.6/recovery.conf.pgcluster.pcmk rm -f /var/lib/pgsql/9.6/.*_constraints_processed # name generated from postgres_ha_cluster_pg_res_name

- RUN ONLY ON MASTER NODE:

systemctl stop postgresql-9.6 rm -f /var/lib/pgsql/9.6/recovery.conf.pgcluster.pcmk rm -f /var/lib/pgsql/9.6/.*_constraints_processed rm -f /var/lib/pgsql/9.6/data/recovery.conf rm -f /var/lib/pgsql/9.6/data/.synchronized

Make sure no postgres db is running.

ps aux | grep postgres systemctl start postgresql-9.6 systemctl status postgresql-9.6

Check postgres db functionality.

- START AGAIN

Check variables & defaults and run ansible role again.



License
-------

BSD

Author Information
------------------

Created by YanChi.

Originally part of the Danube Cloud project (https://github.com/erigones/esdc-ce).