zalando-stups / senza

Deploy immutable application stacks and create and execute AWS CloudFormation templates in a sane way
https://pypi.python.org/pypi/stups-senza

Add WaitCondition for the Elastigroup custom resource #568

Open lmineiro opened 5 years ago

lmineiro commented 5 years ago

Native Auto Scaling Groups use a combination of CreationPolicy and the cfn-signal helper script to notify the CloudFormation stack once the required number of instances in the group has been provisioned successfully.

Elastigroups are a custom resource, so CreationPolicy doesn't apply to them (and the Senza::Elastigroup component currently has no such parameter anyway). For a custom resource such as an Elastigroup, Senza should use a WaitCondition together with a WaitConditionHandle, which provides the same functionality as CreationPolicy does for native Auto Scaling Groups.

Example:

```json
{
  "AWSTemplateFormatVersion": "2010-09-09",
  "Description": "Test cfn-signal for Elastigroup",
  "Resources": {
    "ExampleElastigroup": {
      "Properties": {
        "ServiceToken": "arn:aws:lambda:eu-central-1:178579023202:function:spotinst-cloudformation",
        "accessToken": "XXXXXX",
        "accountId": "act-XXXXXX",
        "group": {
          "capacity": {
            "maximum": 1,
            "minimum": 1,
            "target": 1
          },
          "compute": {
            "instanceTypes": {
              "ondemand": "t3.medium",
              "spot": [
                "t3.medium"
              ]
            },
            "launchSpecification": {
              "imageId": "ami-XXXXXXX",
              "monitoring": false,
              "ebsOptimized": false,
              "keyPair": "XXXXXX",
              "securityGroupIds": [
                "sg-XXXXXX"
              ],
              "tags": [
                {
                  "tagKey": "myTag",
                  "tagValue": "myKey"
                }
              ],
              "userData": {
                "Fn::Base64": {
                  "Fn::Join": [
                    "",
                    [
                      "#!/bin/bash -xe\n",
                      "yum install -y aws-cfn-bootstrap\n",
                      "/opt/aws/bin/cfn-signal '",
                      { "Ref": "WaitHandle" },
                      "'\n"
                    ]
                  ]
                }
              }
            },
            "product": "Linux/UNIX",
            "availabilityZones": [
              {
                "name": "eu-central-1b",
                "subnetIds": [
                  "subnet-XXXXXXX"
                ]
              }
            ]
          },
          "name": "Example-Elastigroup",
          "strategy": {
            "risk": 100,
            "availabilityVsCost": "balanced",
            "drainingTimeout": 120,
            "fallbackToOd": true,
            "lifetimePeriod": "days",
            "persistence": {},
            "revertToSpot": {
              "performAt": "always"
            }
          }
        }
      },
      "Type": "Custom::elastigroup"
    },
    "WaitHandle": {
      "Type": "AWS::CloudFormation::WaitConditionHandle"
    },
    "WaitCondition": {
      "Type": "AWS::CloudFormation::WaitCondition",
      "DependsOn": "ExampleElastigroup",
      "Properties": {
        "Handle": { "Ref": "WaitHandle" },
        "Timeout": "300",
        "Count": "1"
      }
    }
  }
}
```
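To illustrate how the two extra resources from the example could be generated rather than written by hand, here is a minimal Python sketch. It is not Senza's actual component code: the function name `add_wait_condition`, its parameters, and the `<group>WaitHandle` / `<group>WaitCondition` naming scheme are all assumptions made for illustration. It only manipulates a plain template dictionary.

```python
import copy


def add_wait_condition(template, group_name, count=1, timeout=300):
    """Return a copy of `template` with a WaitConditionHandle and a
    WaitCondition attached to the resource called `group_name`.

    Hypothetical helper: the names and defaults are illustrative only.
    """
    if group_name not in template.get("Resources", {}):
        raise KeyError(f"unknown resource: {group_name}")

    # Work on a copy so the caller's template is left untouched.
    result = copy.deepcopy(template)
    resources = result["Resources"]

    handle_name = f"{group_name}WaitHandle"
    condition_name = f"{group_name}WaitCondition"

    resources[handle_name] = {
        "Type": "AWS::CloudFormation::WaitConditionHandle",
    }
    resources[condition_name] = {
        "Type": "AWS::CloudFormation::WaitCondition",
        # Start waiting only after the custom resource has been created.
        "DependsOn": group_name,
        "Properties": {
            "Handle": {"Ref": handle_name},
            # Strings, matching the values used in the example template.
            "Timeout": str(timeout),
            "Count": str(count),
        },
    }
    return result
```

In such a setup, `count` would presumably follow the group's target capacity, so that the stack waits for one success signal per expected instance, and the user data would have to reference the generated handle name.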

Implementing this will also require changes in Taupage, specifically in the init.sh script, which currently does the signaling.
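For context on what signaling a wait condition involves: the handle resolves to a presigned URL, and a signal is a small JSON document sent to it with an HTTP PUT. The sketch below is not Taupage's code and is not a replacement for cfn-signal; `build_signal` and `send_signal` are hypothetical helper names, and the random default `UniqueId` is an assumption (any value that is unique per signal should do).

```python
import json
import urllib.request
import uuid


def build_signal(success, reason, unique_id=None, data=""):
    """Build the JSON body of a wait condition signal (hypothetical helper)."""
    return json.dumps({
        "Status": "SUCCESS" if success else "FAILURE",
        "Reason": reason,
        # Must differ between signals sent to the same wait condition.
        "UniqueId": unique_id or str(uuid.uuid4()),
        "Data": data,
    }).encode("utf-8")


def send_signal(handle_url, body):
    """PUT the signal body to the presigned wait condition handle URL."""
    request = urllib.request.Request(
        handle_url,
        data=body,
        method="PUT",
        # Send an empty Content-Type so that urllib does not add its
        # default form-encoding header to the presigned request.
        headers={"Content-Type": ""},
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

An instance would call `send_signal(url, build_signal(True, "Application is healthy"))` only once it is actually ready to receive traffic, or send a failure signal to make the stack creation fail early instead of waiting for the timeout.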

The missing wait condition is believed to be the reason why stacks with a Senza::Elastigroup component reach the CREATE_COMPLETE status before they have healthy instances ready to receive traffic, which forces workarounds before traffic can be switched.