aws-samples / aws-codecommit-serverless-backup

A serverless solution to back up CodeCommit repositories to S3
MIT No Attribution

Introduction

AWS CodeCommit is a fully-managed source control service that makes it easy for companies to host secure and highly scalable private Git repositories. CodeCommit eliminates the need to operate your own source control system or worry about scaling its infrastructure. You can use CodeCommit to securely store anything from source code to binaries, and it works seamlessly with your existing Git tools.

You typically don't need to worry about backing up your CodeCommit repositories as CodeCommit's architecture is highly scalable, redundant, and durable. However, there are situations where backups might be helpful. For instance, if one accidentally deletes the CloudFormation stack that created the CodeCommit repository, the entire repository and its contents are also deleted for good. Oops.

As per AWS documentation: "Deleting an AWS CodeCommit repository is a destructive one-way operation that cannot be undone. To restore a deleted repository, you will need to create the repository again and use either a backup or a local copy from a full clone to upload the data".

So having a backup handy is not a bad idea. Better safe than sorry!

The Solution

This project offers a serverless CodeCommit backup solution (who wants to manage servers these days?) that uses an Amazon CloudWatch event rule as a trigger (e.g., run the backup every day at 2am UTC); see the figure below for details. The CloudWatch event targets an AWS Lambda function, which starts an AWS CodeBuild container that generates a backup of all AWS CodeCommit repositories within a particular AWS account and region. Each backup is a .tar.gz file named after the repository plus a timestamp (e.g., Repo1_2017_10_01_02_00).

The backups are stored in a designated S3 bucket, grouped by repository (e.g., backup-bucket/Repo1, backup-bucket/Repo2, etc.). You can use S3 lifecycle rules to automatically move old backups into Amazon Glacier (cold storage), or alternatively specify an expiration policy so that backup files in S3 are deleted after a certain period of time. As a security best practice, the S3 bucket storing the backups should also have default encryption enabled.
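The backup step that runs inside CodeBuild could look roughly like the sketch below. This is not the project's actual build script: the function names are hypothetical, the use of `git clone --mirror` is one possible choice, and it assumes the AWS CLI and git are available in the container with the CodeCommit credential helper already configured for HTTPS clones.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the backup step; the project's real script may differ.

# Archive name: <repo>_<YYYY_MM_DD_HH_MM>.tar.gz
archive_name() {
  local repo="$1" stamp="$2"
  printf '%s_%s.tar.gz\n' "$repo" "$stamp"
}

# S3 destination: s3://<bucket>/<repo>/<archive>
s3_destination() {
  local bucket="$1" repo="$2" archive="$3"
  printf 's3://%s/%s/%s\n' "$bucket" "$repo" "$archive"
}

# Clone one repository, pack it, and upload the archive.
backup_repo() {
  local repo="$1" bucket="$2" region="$3"
  local stamp archive workdir
  stamp="$(date -u +%Y_%m_%d_%H_%M)"
  archive="$(archive_name "$repo" "$stamp")"
  workdir="$(mktemp -d)" || return 1
  # Assumes the CodeCommit credential helper is set up for git over HTTPS.
  git clone --mirror \
    "https://git-codecommit.${region}.amazonaws.com/v1/repos/${repo}" \
    "${workdir}/${repo}" || return 1
  tar -czf "${workdir}/${archive}" -C "$workdir" "$repo" || return 1
  aws s3 cp "${workdir}/${archive}" \
    "$(s3_destination "$bucket" "$repo" "$archive")" || return 1
  rm -rf "$workdir"
}

# Back up every repository in the given region of the current account.
backup_all_repos() {
  local bucket="$1" region="$2" repo
  for repo in $(aws codecommit list-repositories --region "$region" \
      --query 'repositories[].repositoryName' --output text); do
    backup_repo "$repo" "$bucket" "$region" || echo "backup failed: $repo" >&2
  done
}
```

Calling `backup_all_repos backup-bucket us-east-1` (both values are placeholders) would then walk every repository in that region. A failure on one repository is reported on stderr and the loop moves on to the next.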

[Figure: approach overview]
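The lifecycle and encryption settings mentioned above can be applied with the AWS CLI. The following is a minimal sketch, not part of the project's deployment: the function names, the rule ID, and the 30-day and 365-day thresholds are illustrative assumptions. It combines a Glacier transition with a later expiration, although either can be used on its own.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print an S3 lifecycle configuration that moves objects
# to Glacier after $1 days and deletes them after $2 days.
lifecycle_json() {
  local glacier_days="$1" expire_days="$2"
  cat <<EOF
{
  "Rules": [
    {
      "ID": "archive-then-expire-backups",
      "Status": "Enabled",
      "Filter": { "Prefix": "" },
      "Transitions": [ { "Days": ${glacier_days}, "StorageClass": "GLACIER" } ],
      "Expiration": { "Days": ${expire_days} }
    }
  ]
}
EOF
}

# Apply the lifecycle rule and enable default (SSE-S3) encryption on a bucket.
apply_backup_bucket_settings() {
  local bucket="$1"
  aws s3api put-bucket-lifecycle-configuration \
    --bucket "$bucket" \
    --lifecycle-configuration "$(lifecycle_json 30 365)" || return 1
  aws s3api put-bucket-encryption \
    --bucket "$bucket" \
    --server-side-encryption-configuration \
    '{"Rules":[{"ApplyServerSideEncryptionByDefault":{"SSEAlgorithm":"AES256"}}]}'
}
```

Running `apply_backup_bucket_settings backup-bucket` (placeholder bucket name) would apply both settings. The expiration period has to be longer than the transition period, otherwise S3 rejects the rule.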

Deploying the Solution

aws_profile="default"                  # default AWS profile (or choose another profile)
backup_schedule="cron(0 2 * * ? *)"    # backups scheduled for 2am UTC, every day
scripts_s3_bucket="[S3-BUCKET-FOR-BACKUP-SCRIPTS]" # bucket must exist in the SAME region where the deployment takes place
backups_s3_bucket="[S3-BUCKET-FOR-BACKUPS]" # bucket must exist and have no policy that disallows PutObject from CodeBuild
stack_name="codecommit-backups"        # CloudFormation stack name for the solution

# make the deployment script executable, then run it
chmod +x ./deploy.sh
./deploy.sh
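Both buckets have to exist before the deployment runs, so a quick pre-flight check can save a failed stack creation. The sketch below is an optional addition and not part of deploy.sh; the function names are hypothetical, and it only checks that the placeholders were replaced and that the buckets are reachable, not that they are in the right region.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helpers; not part of the project's deploy.sh.

# Succeeds if the value is empty or still looks like an unfilled [PLACEHOLDER].
is_placeholder() {
  case "$1" in
    ""|\[*) return 0 ;;
    *) return 1 ;;
  esac
}

# Succeeds if the bucket exists and the given profile can access it.
bucket_exists() {
  local bucket="$1" profile="$2"
  aws s3api head-bucket --bucket "$bucket" --profile "$profile" >/dev/null 2>&1
}

# Usage: preflight <aws_profile> <bucket> [<bucket> ...]
preflight() {
  local profile="$1" bucket
  shift
  for bucket in "$@"; do
    if is_placeholder "$bucket"; then
      echo "bucket name not set: '${bucket}'" >&2
      return 1
    fi
    if ! bucket_exists "$bucket" "$profile"; then
      echo "bucket missing or not accessible: ${bucket}" >&2
      return 1
    fi
  done
}
```

For example, `preflight "$aws_profile" "$scripts_s3_bucket" "$backups_s3_bucket" && ./deploy.sh` would run the deployment only when both checks pass.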

Catch failed runs

License Summary

This sample code is made available under the MIT-0 license. See the LICENSE file.