Backup of all storage containers (or data)

knutole commented 8 years ago

We need to back up all critical data at all times.

Critical data

postgis_store_dev (contains all geo-data) (see https://github.com/systemapic/wu/issues/280)
mongo_store_dev (contains models for portal: projects, users, etc.)
redis_store_dev (contains layers for tileserver)
redis_stats_store_dev (contains stats)
dev_store_dev_common (contains files, eg. rendered tiles, etc.)

Question is how to do this best:

is it better to do this at the docker-container level?
or by using the databases' built-in replication etc.?
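
For comparison, the container-level option could look roughly like the sketch below: it archives one path from a store container with tar, using a throwaway container. The `busybox` image, the `/backup` mount point and the data path argument are assumptions, not taken from the current setup. Note that a file-level copy of a live database directory is not guaranteed to be consistent, so for the database stores this would need the service stopped first, or a proper dump instead.

```shell
#!/bin/bash
# Sketch: archive one path from a data-only "store" container into a tarball on the host.
# Assumptions: the busybox image is available, and the caller knows the data path
# inside the store container (it is not taken from the real setup).
backup_store() {
    local store="$1"      # e.g. mongo_store_dev
    local data_path="$2"  # path of the volume inside the store container
    local out_dir="$3"    # host directory that receives the archive
    local stamp
    stamp="$(date +%Y%m%d-%H%M%S)"

    # --rm removes the helper container afterwards; the leading slash is
    # stripped from the data path so tar stores relative names.
    docker run --rm \
        --volumes-from "$store" \
        -v "$out_dir":/backup \
        busybox tar czf "/backup/${store}-${stamp}.tar.gz" -C / "${data_path#/}"
}
```

Usage would be something like `backup_store dev_store_dev_common /data /srv/backups`, where both paths are placeholders.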

strk commented 8 years ago

As of commit 06ce5614054a8d5398cfc71d3f8fa5f3c6a0705d the "postgis" docker image includes scripts to perform a soft upgrade on all existing databases and to create and restore databases from a set of dumps.

Details can be found in a README there: https://github.com/systemapic/docker-systemapic/blob/master/build/postgis/README.md

The restore-from-dump script is also automatically run by the default entrypoint script (start.sh) IFF a specific env variable is set, pointing to a backup directory. This is used by the do_restore.sh script documented in the backup/postgis docker: https://github.com/systemapic/docker-systemapic/tree/master/build/backup/postgis
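
As a rough illustration of the "IFF the env variable is set" behaviour (not the actual start.sh), an entrypoint guard could look like the sketch below. The variable name is the one used later in this thread; `restore_from_dumps` is a hypothetical stand-in for the image's real restore script.

```shell
#!/bin/bash
# Sketch of an entrypoint guard: restore only when the variable is set and
# points to an existing directory. restore_from_dumps is a hypothetical
# stand-in for the image's real restore script.
maybe_restore() {
    local dump_dir="${SYSTEMAPIC_RESTORE_POSTGIS_FROM:-}"

    if [ -z "$dump_dir" ]; then
        echo "no restore requested, starting normally"
        return 0
    fi
    if [ ! -d "$dump_dir" ]; then
        echo "restore directory not found: $dump_dir" >&2
        return 1
    fi
    restore_from_dumps "$dump_dir"
}
```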

So the "hard upgrade" (restore from dumps) support can be used to upgrade a cluster to a later PostgreSQL version and possibly also at the same time to a later PostGIS version in it. Practically, it can be used to restore dumps into a new postgis docker with any combination of versions.

The current "postgis" Dockerfile accepts a build argument to specify the PostgreSQL version. It could be updated to add more build arguments, eventually.
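
A build with that argument might be wrapped as in the sketch below. `PG_VERSION` is a hypothetical argument name (the README has the real one), and the default build context is assumed from the repository layout linked above.

```shell
#!/bin/bash
# Sketch: build the postgis image for a given PostgreSQL version.
# PG_VERSION is a hypothetical build-argument name; check the Dockerfile for the real one.
build_postgis() {
    local pg_version="$1"                # e.g. 9.4
    local tag="$2"                       # image tag to produce
    local context="${3:-build/postgis}"  # assumed build context directory

    docker build --build-arg "PG_VERSION=${pg_version}" -t "$tag" "$context"
}
```

For example: `build_postgis 9.4 systemapic/postgis:94-21`.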

Right now the last dump performed by the backup/postgis container has been restored into two new stores on dev2.systemapic.com (mx): postgresql93_store_dev2 (to be used with systemapic/postgis:93-21) and postgresql94_store_dev2 (to be used with systemapic/postgis:94-21). The current postgis service there is provided by systemapic/postgis:94-21 using postgresql94_store_dev2 (but the change in docker-compose.yml has not been committed yet).

knutole commented 8 years ago

I have a suggestion. I'm starting to think it's better if we make an independent restore_postgis_backup image that will do the following:

will run independently, not connected with anything in docker-compose.yml
connect (--volumes-from) with two stores: one fresh store_postgis_fresh, and one store_postgis_backup containing backup
when we need to restore, we simply run this image once. this will restore backup from store_postgis_backup into store_postgis_fresh
we can then simply swap the new store_postgis_fresh with whatever is in docker-compose for postgis.

So whenever we need to restore, we simply run the restore container, and get a freshly restored backup that we can connect in docker-compose afterwards.

The reason for this suggestion is that it gets messy in docker-compose with the ENV vars. I mean, we have to start the whole compose once with ENV SYSTEMAPIC_RESTORE_POSTGIS_FROM=pgbk_test2 - but then what? We have to restart the whole compose again to remove the ENV. With a separate process for restoring, we only need to restart compose once (or soon, with new upgrades to Docker, we can probably switch stores without restarting at all).

Could be easily put in a script, with to/from args. I mean, we almost have this working now, simply adding the ENV to systemapic/postgis:latest and that will (almost) work. For example:

#!/bin/bash

# Usage: restore_to_fresh.sh store_postgis_backup store_postgis_fresh
BACKUP_STORE=$1
FRESH_STORE=$2

echo "Restoring $BACKUP_STORE into $FRESH_STORE"
docker run -it --volumes-from "$BACKUP_STORE" --volumes-from "$FRESH_STORE" -e SYSTEMAPIC_RESTORE_POSTGIS_FROM="$BACKUP_STORE" systemapic/postgis:backup

echo "Done! Connect restored volume $FRESH_STORE in docker-compose."

Then, in theory, the store_postgis_fresh can be connected in docker-compose and should be identical to pre-crash backup.

What do you think? Will it work?

knutole commented 8 years ago

@strk Also, instead of having a backup container, would it be possible to simply connect two volumes (store_postgis, store_backup) to the postgis container and keep data in one and backup in the other?

(Btw, would it be possible to simply rsync between the two /var/lib/postgresql/9.4/main/ folders in each container? Or is a proper dump preferred?)

I know you must be laughing (or crying) now, remembering this was your initial idea, to not have a backup image! But if this is possible - without any catches I'm not aware of - what do you think about it?

I know this is a bit last-minute, I mean, we want to move on. At the same time, most of the heavy lifting is done and just a few scripts here and there should do the trick, and it will simplify our setup. It's compatible with the suggestion above as far as I can see.

Do you see any problems with this approach?

strk commented 8 years ago

First thing: we already have a script that does what you want to do here:

when we need to restore, we simply run this image once. this will restore backup from store_postgis_backup into store_postgis_fresh

The above is what the current do_restore.sh script does: https://github.com/systemapic/docker-systemapic/blob/master/build/backup/postgis/restore/do_restore.sh

To do what you mention above, you'd call it like this (from the host system):

 ./do_restore.sh store_postgis_backup /backup/postgis/postgis-backup-last systemapic/postgis:94-21

Is that good enough? Note that it doesn't take a separate image to do that; rather, you specify the name of an existing image to find the pg_dump command in. This lets you restore into any new PostgreSQL data directory (for example to upgrade from 9.3 to 9.5).

Second, for backup:

a proper dump is the only way to be able to restore into an arbitrary new version of postgresql/postgis (even major versions)
running the backup cron within the same container having postgis is ok, except that you'd then have to restart the whole postgis service in case you want to change something in backup configuration (right now schedule is part of the image)
rsync (or similar) is not a replacement for proper dump but an additional service, more for high availability/hot swap than anything else (it'd be an incremental "backup").
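
To make the "proper dump" point concrete, here is a minimal sketch of a dump job that a cron entry could call. It is not the script the backup/postgis image uses. It assumes psql, pg_dump and pg_dumpall can connect without extra options (for example via PGHOST/PGUSER, or by running as the postgres user).

```shell
#!/bin/bash
# Sketch: dump every non-template database in custom format, plus cluster globals.
# Assumes psql, pg_dump and pg_dumpall can connect without further options
# (e.g. via PGHOST/PGUSER or by running as the postgres user).
dump_all_databases() {
    local out_dir="$1"
    local db
    mkdir -p "$out_dir" || return 1

    # Roles and tablespaces are cluster-wide and not part of per-database dumps.
    pg_dumpall --globals-only > "$out_dir/globals.sql" || return 1

    # -A: unaligned output, -t: tuples only, giving one database name per line.
    # The loop runs in a subshell; its exit status becomes the function's status.
    psql -At -d postgres -c "SELECT datname FROM pg_database WHERE NOT datistemplate" |
    while read -r db; do
        pg_dump -Fc -f "$out_dir/$db.dump" "$db" || exit 1
    done
}
```

Custom-format dumps (`-Fc`) are restored with pg_restore, which is what allows loading them into a newer PostgreSQL version.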

knutole commented 8 years ago

Still TODO

[x] double-check backup/restore is working for @knutole
[ ] refactor backup flow; instead of container in compose, dump from postgis image directly.
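
For the second item, dumping directly from the running postgis container could be as small as the sketch below. The container name is whatever compose assigns, and the `postgres` user is an assumption about the setup.

```shell
#!/bin/bash
# Sketch: dump one database from a running postgis container to a file on the host.
# The container name and the "postgres" user are assumptions about the setup.
dump_from_container() {
    local container="$1"   # name of the running postgis container
    local db="$2"          # database to dump
    local out_file="$3"    # destination file on the host

    # pg_dump writes the custom-format dump to stdout, which docker exec
    # forwards to the host. No -t flag: a TTY would corrupt the binary output.
    docker exec "$container" pg_dump -U postgres -Fc "$db" > "$out_file"
}
```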

systemapic / docker-systemapic

Backup of all storage containers (or data) #7