GSA / data.gov

Main repository for the data.gov service
https://data.gov

Application-independent way to backup/restore data services to/from S3 bucket #2768

Open adborden opened 3 years ago

adborden commented 3 years ago

User Story

In order to enable recovery from major outages, as well as snapshotting from production to other environments, the data.gov team wants provisioned data-services to be dumped to S3 storage regularly, with a documented and tested path for restoration.

Acceptance Criteria

Background

Security Considerations (required)

Backup and retention policy is documented in the SSP. The implementation should be consistent with what is documented.

Sketch

In more detail

Storage

Create a private S3 bucket in the gsa-datagov/management space, and call the instance service-dumps.

  cf t -s management
  cf create-service s3 basic service-dumps

Make the service accessible from the staging and production spaces (it still "lives" in the management space):

  cf share-service service-dumps -s staging
  cf share-service service-dumps -s production

The backup-manager app

Make an app that will act as a utility for making and restoring backups across environments. The app should include:

The app should use the apt-buildpack to get those installed. (If the AWS CLI can't be installed using apt, then just curl it and unzip it in the app .profile.) Use binary-buildpack for the final buildpack.

The .profile should parse out creds for the service-dumps bucket and set the environment variables properly so that the aws CLI will be able to aws s3 cp to and from the bucket.
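As a rough illustration of that .profile step, here is a minimal sketch. It assumes jq is available in the app container and that the bound S3 instance exposes access_key_id, secret_access_key, region and bucket in its credentials block of VCAP_SERVICES. The function name export_s3_credentials and the BACKUP_BUCKET variable are hypothetical, not something this issue specifies.

```shell
#!/usr/bin/env bash
# Sketch of a .profile fragment (assumes jq is installed in the container).

# Export AWS CLI environment variables from the bound service instance
# whose name is given as $1. Returns 1 if no such instance is bound.
export_s3_credentials() {
  local instance_name="$1"
  local creds

  # Look through every service type in VCAP_SERVICES for a matching instance.
  creds=$(printf '%s' "${VCAP_SERVICES:-}" | jq -c --arg name "$instance_name" \
    '[.[][] | select(.name == $name)][0].credentials // empty')

  if [ -z "$creds" ]; then
    echo "no bound service named $instance_name" >&2
    return 1
  fi

  # Credential key names below are assumptions about the S3 broker's output.
  AWS_ACCESS_KEY_ID=$(printf '%s' "$creds" | jq -r '.access_key_id')
  AWS_SECRET_ACCESS_KEY=$(printf '%s' "$creds" | jq -r '.secret_access_key')
  AWS_DEFAULT_REGION=$(printf '%s' "$creds" | jq -r '.region')
  BACKUP_BUCKET=$(printf '%s' "$creds" | jq -r '.bucket')  # hypothetical name
  export AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_DEFAULT_REGION BACKUP_BUCKET
}

# Only attempt the export when running inside Cloud Foundry.
[ -z "${VCAP_SERVICES:-}" ] || export_s3_credentials service-dumps
```

With those variables exported, a command such as aws s3 cp dump.gz "s3://$BACKUP_BUCKET/..." should pick up the credentials without any further configuration.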

The app manifest should include a default start-command which summarizes other commands available:
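The summary text itself is not reproduced in this issue. As a hypothetical sketch only, a start command could print something like the following, based solely on the backup and restore invocations shown under Usage; the wording and the usage function name are assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical default start command: print the available commands and exit.

usage() {
  cat <<'EOF'
backup-manager: utility for backing up and restoring bound data services

Commands (run with: cf run-task backup-manager --command "<command>"):
  backup <service>                        dump <service> to the service-dumps bucket
  restore <service>                       restore the most recent backup from this space
  restore <service> <timestamp> <space>   restore a specific backup taken in <space>
EOF
}

usage
```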

Deploy the app in each space, but don't start it or give it a route:

cf target -s staging
cf push backup-manager --task
cf target -s production
cf push backup-manager --task

Usage

Making backups

cf bind-service backup-manager my-service
cf run-task backup-manager --command "backup my-service"
cf unbind-service backup-manager my-service

Restoring backups

Restore from the most recent backup in this space

cf bind-service backup-manager my-service
cf run-task backup-manager --command "restore my-service"
cf unbind-service backup-manager my-service
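How "most recent" gets resolved is not spelled out here. One possible approach, assuming backups are stored under timestamped keys in a sortable YYYYMMDD-HHMM form (which is how a value like 20211122-2248 reads), is sketched below. The key layout <space>/<service>/<timestamp>.gz, the use of UTC, and all three function names are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical naming helpers; the key layout is an assumption, not a spec.

# Current time in a lexicographically sortable form, e.g. 20211122-2248.
backup_timestamp() {
  date -u +%Y%m%d-%H%M
}

# Build the object key for a backup: <space>/<service>/<timestamp>.gz
backup_key() {
  local space="$1" service="$2" timestamp="$3"
  printf '%s/%s/%s.gz\n' "$space" "$service" "$timestamp"
}

# Read one timestamp per line on stdin and print the most recent one.
# Plain byte-order sorting works because the format is fixed-width.
latest_timestamp() {
  LC_ALL=C sort | tail -n 1
}
```

Because the space is part of the key, restoring a backup taken in another space only requires passing a different first argument to backup_key.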

Restore a particular backup from the production space:

cf bind-service backup-manager my-service
cf run-task backup-manager --command "restore my-service 20211122-2248 production"
cf unbind-service backup-manager my-service

adborden commented 2 years ago

https://github.com/gsa/cf-backup-manager now exists and can be expanded to meet this story. So far we've implemented enough to support backup/restore of MySQL and PostgreSQL services.