bcgov / cas-postgres

Postgres container image intended for OpenShift and general usage
Apache License 2.0
2 stars 0 forks source link
cas-ggircs env

cas-postgres

CircleCI

This repository contains the Spilo + pgtap docker container and helm chart used by the various applications deployed by the CAS team.

Docker container

The docker container adds the pgtap extension (unit testing for PostgreSQL) and a couple of utility scripts to the official Spilo container developed by Zalando.

BATS tests for the aforementioned scripts can be run using the corresponding make commands (run make help for more information).

Helm Charts

The helm/patroni folder contains a fork of the patroni helm chart that was in the (now deprecated) helm charts incubator, and can be reused by other teams. The helm/cas-postgres folder contains helm values used by the CAS team.

In order to use WALG backups, you will need to pass in storage bucket names and service accounts that have already been provisioned. You can do this utilizing the terraform-bucket-provision chart from cas-pipeline.

Installation

Installation of the cas-postgres helm chart is simply done via the helm install command. See helm/cas-postgres/values.yaml for more information on the values that should be provided

Steps for a complete fresh install (will lose all data)

Point in Time Recovery

In the event a database needs to be recovered from a backup to a specific point in time, these are the steps to follow. Spilo is shipped with a clone_with_wale method that we can leverage to handle this for us with the addition of a few environment variables.

These steps assume you are using Google Cloud to store backups. If you are using something else (like S3), the process should be similar for other providers and the necessary environment variables are described in the configure_spilo script here.

Steps

Patroni will then start up your database and restore to the point defined in CLONE_TARGET_TIME.

If successful you will see something like this in your logs: (docs/images/success.png)

Patroni will begin bootstrapping your backup. You will see a lot "Not Healthy..bootstrap in progress..rejecting connections" in your logs for a while as shown below. This is normal, do not stop the process. (docs/images/bootstrapping.png) This will repeat for a few moments. Once the bootstrap is finished you'll see a bunch of SQL statements followed eventually by your pod declaring it is the leader with the lock. At this point you're all done!

Command-Line Walkthrough

Esure you are in the correct namespace!

  1. oc scale statefulset <statefulset-name> --replicas=0

    • This will scale your statefulset to 0
    • The statefulset name is probably the same as your patroni cluster
    • To get a list of all statefulsets use oc get statefulsets
  2. oc delete configmap <configmap-name>

    • You'll need to do this for each configmap prefixed by the name of your patroni cluster
    • example: \<patroni-cluster>-leader, \<patroni-cluster>-config, \<patroni-cluster>-failover
    • To get a list of all configmaps use oc get configmaps
  3. oc delete pvc <pvc-name>

    • Again, you'll need to do this for each PVC that contains the name of your patroni cluster
    • example: storage-volume-\<patroni-cluster>-0, storage-volume-\<patroni>-1
    • To get a list of all PVCs use oc get pvc
  4. oc set env statefulset/<statefulset-name> CLONE_SCOPE=<patroni-cluster-name> CLONE_METHOD=CLONE_WITH_WALE CLONE_WALG_GS_PREFIX=<google-cloud-prefix> CLONE_GOOGLE_APPLICATION_CREDENTIALS=<path-to-credentials> CLONE_TARGET_TIME='<target-restore-time-with-timezone>' PGVERSION=<major-postgres-version-to-restore-with>

    • To get a list of all environment variables in your statefulset use oc set env statefulset/<statefulset-name> --list
    • Examples of values between< > are above in the steps section
  5. oc scale statefulset <statefulset-name> --replicas=<n> with 'n' being desired number of pods.

Openshift UI Walkthrough
  1. Scale your statefulset to 0

    • Under Workloads in the left pane, go to StatefulSets (docs/images/ss.png)
    • Use the arrows to scale your pods down to 0 (docs/images/scaled-down.png)
  2. Delete configmaps

    • Under Workloads in the left pane, go to ConfigMaps (docs/images/config.png)
    • Delete all configmaps in the list that are prefixed by your patroni cluster name
    • example: \<patroni-cluster>-leader, \<patroni-cluster>-config, \<patroni-cluster>-failover
  3. Delete PVCs

    • Under Storage in the left pane, go to PersistentVolumeClaims (docs/images/pvc.png)
    • Delete all PVCs in the list that contain your patroni cluster name
    • example: storage-volume-\<patroni-cluster>-0, storage-volume-\<patroni>-1
  4. Set environment variables

    • Under Workloads in the left pane, go back to StatefulSets (docs/images/ss.png)
    • Select your statefulset & go to the Environment tab
    • Add the environment variables described in the Steps section above
    • Save your environment variables with the save button at the bottom of the page
  5. Scale up your statefulset

    • Go to the Details tab & use the arrows to scale up your statefulset