ucd-library / pg-farm-v0

UC Davis PostgreSQL Farm
MIT License
0 stars 0 forks source link

Overview

Replicate Types

Streaming replication

Nightly Replication

Standalone/Snapshot

Streaming Replicate Setup

Two user accounts must be created a replication account and a read-only library user account.

Example to create replication account:

CREATE ROLE library_replicate WITH REPLICATION PASSWORD 'password' LOGIN

Then create the library user role

Additional Notes:

pg_hba.conf

host    replication     library_replicate      pg.farm.ip/32    md5

postgresql.conf

wal_level = replica
wal_log_hints = on
max_wal_senders = 3
wal_keep_segments = 8
hot_standby = on

Snapshot Setup

Create libary_user role

Create the library user account and provice the proper access

CREATE ROLE library_user WITH PASSWORD 'library_user' LOGIN
GRANT CONNECT ON DATABASE db_name TO 'library_user';
GRANT USAGE ON SCHEMA schema_name TO 'library_user';
GRANT SELECT ON ALL TABLES IN SCHEMA schema_name TO 'library_user';
GRANT EXECUTE ON SCHEMA schema_name TO 'library_user';

pg_hba.conf

host    db_name         library_user           0.0.0.0/0        md5

Statistics

https://www.postgresql.org/docs/12/monitoring-stats.html

Currently tracking; track_activities, track_counts, track_functions for snapshot and streaming replicate.

Required flags posgtesql.conf

track_activities = on
track_counts = on
track_functions = all   

Example:

 select * from pg_stat_user_functions;

SSL

Need to create certs for domains. Then follow: https://www.postgresql.org/docs/12/ssl-tcp.html

Required flags posgtesql.conf

ssl = on
ssl_prefer_server_ciphers = on
ssl_cert_file = 'server.crt'
ssl_key_file = 'server.key'

Using PG SSL without Letsencrypt https://medium.com/@pavelevstigneev/postgresql-ssl-with-letsencrypt-b53051eacc22

The short:

cp -L /etc/letsencrypt/live/$DOMAIN/fullchain.pem $FARM_DIR/server.crt
cp -L /etc/letsencrypt/live/$DOMAIN/privkey.pem   $FARM_DIR/server.key

Rest API

PostgREST (PGR) is used to allow RESTful HTTP API access.

Setup

The library user role should be user to access the posgrest instance.

Backups

Two different backups will be performed. The first will be a full snapshot of /var/lib/postgres/data capturing the entire state of the database. Because this might contain private (but still S2 level) data, a secondary backup will be generated using the library_user account. This library_user snapshot will be fully accessible to the public.

Backups will be created (nightly for streaming replicates?) by the controller container. All controller containers have a /pg-stage mount point where database dumbs can be stored when backuping up/restoring the database.

AWS S3 Backups

AWS S3 will be used to store backups. A local .aws-credentials file will be mounted to /root/.aws/credentials provided the container access to the S3 bucket.

Database backups will be stored in the S3 bucket defined the in farms config.json file. Inside the bucket, backups will be stored in the folder with the same name as the farms cluster name. Each cluster folder will have two files: backup.zip and public.zip

Example location: s3://pg-farm/my-project/backup.zip

AWS CLI key rotation

The clusters access S3 buckets via the AWS CLI which use Service Accounts associated access key id and access key secret to authenticate. It is recommended you rotate these keys every month or so. The PG Farm CLI as a built in helper function to rotate the access keys for the entire farm.

pg-farm rotate-keys --key-id [id] --key-secret [secret]

Setup

This section contains information on setting up pg-farm and accessing via CLI.

Farm Setup

Setting up pg-farm requires Docker, Docker Compose and the pg-farm cli.

Install

Configure

Choose a root directory to host the farm. For this example we will use /opt/pg-farm.

mkdir /opt/pg-farm

In /opt/pg-farm create a config.json file. It should have the following properties

{ 
  "name": "test-farm", // give the farm a name
  "domain": "localhost", // domain the farm will be hosted at
  "aws": {
    "key_id": "", // aws service account id
    "key_secret": "", // aws service account secret
    "bucket": "pg-farm" // aws s3 bucket for the farm to use
  },
  "certs": {
    "crt" : "", // path to server certificate file
    "key" : "", // path to server key file
  },
  "options": {
    "startPort": 6000, // port number pg-farm will start allocating ports at
    "apache": false // should pg-farm create apache config files for clusters in /etc/apache2/site-available
  }
}

CLI Setup

Install

npm install -g @ucd-lib/pg-farm

Configure

Now create a .pg-farm options file in your home directory and set:

{
  "rootDir" : "/opt/pg-farm"
}

This will set the default location for the farm. You can override this location in the CLI by:

Additonaly you can create an options file /etc/pg-farm/conf. These options will always be used when you run the CLI on your system. Though a local file in your home directory will override any options set in the global file. PG_FARM_ROOT and --root-dir will overwrite the rootDir variable in any config file. Use pg-farm config list to see which files/options are in use.