pshima / consul-snapshot

consul-snapshot is a backup and restore utility for Consul (https://www.consul.io). It differs slightly from other utilities in that it runs as a daemon that takes backups and ships them to S3. It also has integrated monitoring and backup health checks.
Apache License 2.0

consul-snapshot

consul-snapshot is a backup and restore utility for Consul (https://www.consul.io). It differs slightly from other utilities in that it runs as a daemon that takes backups and ships them to S3. In its current state, consul-snapshot is designed only for disaster recovery scenarios and full restores. There is currently no support for single-key or path-based backups.

It is intended to run under Nomad (https://www.nomadproject.io), connected to Consul (https://www.consul.io) and registered as a service with health checks. It also runs fine standalone, outside of Nomad, and can even be used for one-off backups; however, it is designed to run as a daemon.

consul-snapshot runs a small HTTP server that can be used for Consul health checks on backup state. Currently, if the most recent backup is older than 1 hour, requests to /health return HTTP 500, which makes it easy to wire into Consul health checking. consul-snapshot does not register itself as a Consul service; that is expected to be done in the Nomad job spec or manually.

consul-snapshot has been used in production since February 2016.

CHANGELOG

Features

Installation

Grab the binary from Releases

consul-snapshot requires Go 1.8.3 to build, as Consul requires 1.8.3.

With go get:

go get github.com/pshima/consul-snapshot

From source:

git clone https://github.com/pshima/consul-snapshot
cd consul-snapshot
make
make install

Configuration

Configuration is done through environment variables.

Several further options are available through the Consul API client (https://github.com/hashicorp/consul/blob/master/api/api.go#L126).
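As a rough illustration of how Consul client settings are picked up from the environment, the sketch below reads `CONSUL_HTTP_ADDR` and `CONSUL_HTTP_TOKEN`, two variables the Consul API client recognises, and falls back to the client's usual default address of 127.0.0.1:8500. It uses only the standard library and does not call the Consul client itself. The `consulAddr` helper is hypothetical, and whether these are the options the link above points at is an assumption.

```go
package main

import (
	"fmt"
	"os"
)

// consulAddr returns the Consul HTTP address from CONSUL_HTTP_ADDR,
// falling back to 127.0.0.1:8500 when the variable is unset or empty.
// The getenv parameter exists only to make the helper easy to test.
func consulAddr(getenv func(string) string) string {
	if addr := getenv("CONSUL_HTTP_ADDR"); addr != "" {
		return addr
	}
	return "127.0.0.1:8500"
}

func main() {
	fmt.Println("consul address:", consulAddr(os.Getenv))
	if os.Getenv("CONSUL_HTTP_TOKEN") != "" {
		fmt.Println("an ACL token is set via CONSUL_HTTP_TOKEN")
	}
}
```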

Authentication

Authentication is done through the above environment variables. Credentials can be omitted in favor of an EC2 instance IAM profile with write access to the S3 bucket.

Running

Running a backup:

% consul-snapshot backup
[INFO] v0.2.3: Starting Consul Snapshot
2017/08/16 09:33:25 [DEBUG] Backup starting on interval: 15s
2017/08/16 09:33:40 [INFO] Starting Backup At: 1502901220
2017/08/16 09:33:40 [INFO] Listing keys from consul
2017/08/16 09:33:40 [INFO] Converting 4 keys to JSON
2017/08/16 09:33:40 [INFO] Listing Prepared Queries from consul
2017/08/16 09:33:40 [INFO] Converting 0 keys to JSON
2017/08/16 09:33:40 [INFO] Listing ACLs from consul
2017/08/16 09:33:40 [INFO] ACL support detected as disbaled, skipping
2017/08/16 09:33:40 [INFO] Converting 0 ACLs to JSON
2017/08/16 09:33:40 [INFO] Preparing temporary directory for backup staging
2017/08/16 09:33:40 [INFO] Writing KVs to local backup file
2017/08/16 09:33:40 [DEBUG] Wrote 424 bytes to file, /tmp/macbook.local.consul.snapshot.1502901220/consul.kv.1502901220.json
2017/08/16 09:33:40 [INFO] Writing PQs to local backup file
2017/08/16 09:33:40 [DEBUG] Wrote 2 bytes to file, /tmp/macbook.local.consul.snapshot.1502901220/consul.pq.1502901220.json
2017/08/16 09:33:40 [INFO] Writing ACLs to local backup file
2017/08/16 09:33:40 [DEBUG] Wrote 2 bytes to file, /tmp/macbook.local.consul.snapshot.1502901220/consul.acl.1502901220.json
2017/08/16 09:33:40 [DEBUG] Wrote 339 bytes to file, /tmp/macbook.local.consul.snapshot.1502901220/meta.json
2017/08/16 09:33:40 [INFO] Writing Backup to Remote File
2017/08/16 09:33:40 [INFO] Uploading consul-backup-testing/backups/2017/8/16/macbook.local.consul.snapshot.1502901220.tar.gz to S3 in us-west-2
2017/08/16 09:33:40 [INFO] Running post processing
2017/08/16 09:33:40 [INFO] Backup completed successfully
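The upload line in the log above suggests an object key layout of backups/<year>/<month>/<day>/<hostname>.consul.snapshot.<unix timestamp>.tar.gz. The sketch below rebuilds that key from a hostname and timestamp. It is inferred from the sample output only: the `backupKey` helper is hypothetical, and the use of UTC for the date components is an assumption, since the tool may use local time.

```go
package main

import (
	"fmt"
	"time"
)

// backupKey builds an S3 object key in the layout seen in the sample log.
// Month and day are not zero-padded, matching "2017/8/16" in the log.
func backupKey(host string, ts int64) string {
	t := time.Unix(ts, 0).UTC() // UTC is an assumption of this sketch
	return fmt.Sprintf("backups/%d/%d/%d/%s.consul.snapshot.%d.tar.gz",
		t.Year(), int(t.Month()), t.Day(), host, ts)
}

func main() {
	fmt.Println(backupKey("macbook.local", 1502901220))
}
```

Knowing the layout helps when locating the path to pass to the restore command, which takes the key relative to the bucket.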

Running a restore:

% consul-snapshot restore backups/2017/8/16/macbook.local.consul.snapshot.1502901220.tar.gz
[INFO] v0.2.3: Starting Consul Snapshot
2017/08/16 09:36:04 [DEBUG] Starting restore of consul-backup-testing/backups/2017/8/16/macbook.local.consul.snapshot.1502901220.tar.gz
2017/08/16 09:36:04 [INFO] Downloading consul-backup-testingbackups/2017/8/16/macbook.local.consul.snapshot.1502901220.tar.gz from S3 in us-west-2
2017/08/16 09:36:04 [INFO] Download completed
2017/08/16 09:36:04 [INFO] Checking encryption status of backup
2017/08/16 09:36:04 [INFO] Extracting backup
2017/08/16 09:36:04 [INFO] Inspecting backup contents
2017/08/16 09:36:04 [INFO] Found valid metadata of snapshot version 0.2.4 with unix_timestamp 1502901220
2017/08/16 09:36:04 [INFO] Parsing KV Data
2017/08/16 09:36:04 [INFO] Loaded 4 keys to restore
2017/08/16 09:36:04 [INFO] Parsing PQ Data
2017/08/16 09:36:04 [INFO] Loaded 0 Prepared Queries to restore
2017/08/16 09:36:04 [INFO] Parsing ACL Data
2017/08/16 09:36:04 [INFO] Loaded 0 ACLs to restore
2017/08/16 09:36:04 [INFO] Restored 4 keys with 0 errors
2017/08/16 09:36:04 [WARN] PQ restoration currently unsupported
2017/08/16 09:36:04 [WARN] ACL restoration currently unsupported
2017/08/16 09:36:04 [INFO] Restore completed.

Testing

There are some unit tests, but coverage is far from complete.

There is an acceptance test that:

To run the acceptance test, set ACCEPTANCE_TEST=1.
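Gating a slow test on an environment variable usually looks something like the sketch below. It is not taken from the project's test suite: the `acceptanceEnabled` helper is hypothetical, and only the `ACCEPTANCE_TEST=1` convention comes from the text above.

```go
package main

import (
	"fmt"
	"os"
)

// acceptanceEnabled reports whether ACCEPTANCE_TEST is set to "1".
// The getenv parameter exists only to make the helper easy to test.
func acceptanceEnabled(getenv func(string) string) bool {
	return getenv("ACCEPTANCE_TEST") == "1"
}

func main() {
	if !acceptanceEnabled(os.Getenv) {
		fmt.Println("skipping acceptance test; set ACCEPTANCE_TEST=1 to run it")
		return
	}
	fmt.Println("running acceptance test")
}
```

Inside a real `_test.go` file the same check would typically call `t.Skip` instead of returning, so that `go test` reports the test as skipped rather than passed.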

Todos