aws-samples / amazon-rds-auto-restart-protection

Amazon RDS auto-restart protection

This is a plug-and-play solution that automatically stops your RDS instances or Aurora clusters again after AWS restarts them. AWS starts a stopped instance or cluster after seven days so that it does not fall behind on maintenance activities.

Deployment

The solution is deployed using AWS CloudFormation.

Keep in mind that the application is deployed per region, per account.

  1. Create an S3 bucket to upload your artifacts. For more information, see create bucket.
  2. Upload the following files to the root of the newly created S3 bucket:
    • stop-rds-instance-state-machine.json under sources/stepfunctions-code
    • 3 .zip files under sources/lambda-code-deployment-packages

      Lambda .py files are also available under sources/lambda-code. For more information on how to create a .zip deployment package, see python package.

  3. In AWS CloudFormation, start deploying deployment/master-template.yaml. For more information, see create stack.
  4. Finally, tag your RDS instance with auto-restart-protection = yes. Instances with this tag are stopped automatically whenever AWS restarts them after they have been stopped for seven days.
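The tagging in step 4 can also be scripted. The following is a minimal sketch, assuming boto3 is installed and credentials are configured; the helper names `tag_for_auto_restart_protection` and `is_protected` are hypothetical and not part of this repository. The RDS client is passed in as an argument so that the helpers can be exercised with a stub.

```python
TAG_KEY = "auto-restart-protection"
TAG_VALUE = "yes"


def tag_for_auto_restart_protection(rds_client, resource_arn):
    """Add the auto-restart-protection = yes tag to an RDS instance or cluster."""
    rds_client.add_tags_to_resource(
        ResourceName=resource_arn,
        Tags=[{"Key": TAG_KEY, "Value": TAG_VALUE}],
    )


def is_protected(rds_client, resource_arn):
    """Return True if the resource carries the auto-restart-protection = yes tag."""
    tags = rds_client.list_tags_for_resource(ResourceName=resource_arn)["TagList"]
    return any(t["Key"] == TAG_KEY and t["Value"] == TAG_VALUE for t in tags)


# Usage (requires boto3 and AWS credentials; the ARN is a placeholder):
#   import boto3
#   rds = boto3.client("rds")
#   arn = "arn:aws:rds:ap-northeast-2:123456789101:db:database-name"
#   tag_for_auto_restart_protection(rds, arn)
#   print(is_protected(rds, arn))
```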

Configure notifications

The CloudFormation deployment creates an SNS topic, SnsTopicWorkFlowNotification, to which the AWS Step Functions state machine publishes workflow execution notifications. Go to the SNS console (or use the CLI) and subscribe to the topic using SMS, email, or another supported protocol. You will receive notifications for both successful and failed executions.
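As an alternative to the console, the subscription could be created with boto3. This is a sketch under assumptions: it guesses that the deployed topic's ARN contains the string SnsTopicWorkFlowNotification (CloudFormation may add a stack prefix and a random suffix), and the helper names are hypothetical. An email subscription stays pending until the recipient confirms it.

```python
def find_notification_topic(sns_client, name_fragment="SnsTopicWorkFlowNotification"):
    """Return the ARN of the first topic whose ARN contains name_fragment, or None."""
    kwargs = {}
    while True:
        page = sns_client.list_topics(**kwargs)
        for topic in page.get("Topics", []):
            if name_fragment in topic["TopicArn"]:
                return topic["TopicArn"]
        token = page.get("NextToken")
        if not token:
            return None  # no more pages and no match
        kwargs = {"NextToken": token}


def subscribe_email(sns_client, topic_arn, address):
    """Subscribe an email address to the topic and return the subscription ARN."""
    response = sns_client.subscribe(
        TopicArn=topic_arn,
        Protocol="email",
        Endpoint=address,
        ReturnSubscriptionArn=True,
    )
    return response["SubscriptionArn"]


# Usage (requires boto3 and AWS credentials; the address is a placeholder):
#   import boto3
#   sns = boto3.client("sns")
#   arn = find_notification_topic(sns)
#   if arn:
#       subscribe_email(sns, arn, "ops@example.com")
```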

Test your deployment

To test the solution, create a test RDS instance and tag it with auto-restart-protection set to yes. While the RDS instance is still in the starting state, test the Lambda function start-statemachine-execution-lambda with a sample event. The event simulates that the cluster (RDS-EVENT-0153) or instance (RDS-EVENT-0154) was started because it exceeded the maximum time allowed to remain stopped.
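To make the role of the two event IDs concrete, here is a small illustrative helper that classifies an incoming event by its EventID. It is only a sketch of the mapping described above; it is not the code of the deployed Lambda function, and the name `auto_restart_source` is hypothetical.

```python
# Event IDs that signal an automatic restart after the maximum stopped time.
AUTO_RESTART_EVENT_IDS = {
    "RDS-EVENT-0153": "cluster",
    "RDS-EVENT-0154": "instance",
}


def auto_restart_source(event):
    """Return "cluster" or "instance" for an auto-restart event, else None."""
    event_id = event.get("detail", {}).get("EventID")
    return AUTO_RESTART_EVENT_IDS.get(event_id)
```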

For more information on testing a Lambda function with a sample event, see to invoke a function.

A sample Aurora cluster event (replace resources, account, SourceIdentifier and SourceArn with your own values):

{
    "version": "0",
    "id": "a19938cd-14c7-8d2e-9d66-e9db582d2f4d",
    "detail-type": "RDS DB Cluster Event",
    "source": "aws.rds",
    "account": "123456789101",
    "time": "2022-03-07T02:38:03Z",
    "region": "ap-northeast-2",
    "resources": [
        "arn:aws:rds:ap-northeast-2:123456789101:cluster:cluster-name"
    ],
    "detail": {
        "EventCategories": [
            "configuration change"
        ],
        "SourceType": "CLUSTER",
        "SourceArn": "arn:aws:rds:ap-northeast-2:123456789101:cluster:cluster-name",
        "Date": "2022-03-07T02:38:03.747Z",
        "Message": "Finished updating DB parameter group",
        "SourceIdentifier": "cluster-name",
        "EventID": "RDS-EVENT-0153"
    }
}

A sample RDS instance event:

{
    "version": "0",
    "id": "a19938cd-14c7-8d2e-9d66-e9db582d2f4d",
    "detail-type": "RDS DB Instance Event",
    "source": "aws.rds",
    "account": "123456789101",
    "time": "2022-03-07T02:38:03Z",
    "region": "ap-northeast-2",
    "resources": [
        "arn:aws:rds:ap-northeast-2:123456789101:db:database-name"
    ],
    "detail": {
        "EventCategories": [
            "configuration change"
        ],
        "SourceType": "DB_INSTANCE",
        "SourceArn": "arn:aws:rds:ap-northeast-2:123456789101:db:database-name",
        "Date": "2022-03-07T02:38:03.747Z",
        "Message": "Finished updating DB parameter group",
        "SourceIdentifier": "database-name",
        "EventID": "RDS-EVENT-0154"
    }
}

start-statemachine-execution-lambda uses the id parameter as the name of the AWS Step Functions execution. Because an execution name must be unique for a certain period of time, change the value of the id parameter for every test run.
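One way to avoid reusing an id between test runs is to generate a new UUID each time. The sketch below copies a sample event, replaces its id, and invokes the function through boto3. The helper names are hypothetical, and the deployed function name is assumed to be exactly start-statemachine-execution-lambda; adjust it if your stack names the function differently.

```python
import copy
import json
import uuid


def with_fresh_id(event):
    """Return a deep copy of event whose id is a newly generated UUID."""
    fresh = copy.deepcopy(event)
    fresh["id"] = str(uuid.uuid4())
    return fresh


def invoke_with_event(lambda_client, event,
                      function_name="start-statemachine-execution-lambda"):
    """Invoke the Lambda function synchronously and return the HTTP status code."""
    response = lambda_client.invoke(
        FunctionName=function_name,
        Payload=json.dumps(event).encode("utf-8"),
    )
    return response["StatusCode"]


# Usage (requires boto3 and AWS credentials; sample_event is one of the
# sample events shown in this section, loaded as a dict):
#   import boto3
#   invoke_with_event(boto3.client("lambda"), with_fresh_id(sample_event))
```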

Now, verify the execution of the AWS Step Functions state machine. For more information, see to verify an AWS Step Functions state machine execution status.
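The status check can also be done programmatically. The following sketch reads the most recent execution of a state machine with boto3; `latest_execution_status` is a hypothetical helper, and the state machine ARN must be taken from your own deployment (for example, from the CloudFormation stack resources).

```python
def latest_execution_status(sfn_client, state_machine_arn):
    """Return (name, status) of the most recent execution, or None if there is none."""
    # list_executions returns executions sorted with the most recent first.
    response = sfn_client.list_executions(
        stateMachineArn=state_machine_arn,
        maxResults=1,
    )
    executions = response["executions"]
    if not executions:
        return None
    latest = executions[0]
    return latest["name"], latest["status"]


# Usage (requires boto3 and AWS credentials; the ARN is a placeholder):
#   import boto3
#   sfn = boto3.client("stepfunctions")
#   print(latest_execution_status(sfn, "<state machine ARN>"))
```

The returned name should match the id of the test event, and the status is one of the Step Functions execution states such as RUNNING, SUCCEEDED or FAILED.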

Security

See CONTRIBUTING for more information.

License

This library is licensed under the MIT-0 License. See the LICENSE file.