cfn-modules / docs

Rapid CloudFormation: Modular, production ready, open source.
https://github.com/cfn-modules
Apache License 2.0

[Question or Request Feature] ECS Cluster with EC2 #21

Closed: oanhnn closed this issue 4 years ago

oanhnn commented 5 years ago

How can I create an ECS cluster backed by EC2 instances and an Auto Scaling group (ASG)? I am hoping for a module like https://github.com/widdix/aws-cf-templates/blob/master/ecs/cluster.yaml. Currently, I am using the code below:

---
# Copyright 2018 widdix GmbH
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
AWSTemplateFormatVersion: '2010-09-09'
Description: 'cfn-modules: ECS cluster on EC2 with Auto Scaling group (Amazon Linux 2)'
# cfn-modules:implements(ExposeName, ExposeSecurityGroupId)
Parameters:
  VpcModule:
    Description: 'Stack name of vpc module.'
    Type: String
  AlertingModule:
    Description: 'Optional but recommended stack name of alerting module.'
    Type: String
    Default: ''
  BastionModule:
    Description: 'Optional but recommended stack name of module implementing Bastion.'
    Type: String
    Default: ''
  AlbModule:
    Description: 'Optional but recommended stack name of module implementing Alb.'
    Type: String
    Default: ''
  KeyName:
    Description: 'Optional key name of the Linux user ec2-user to establish an SSH connection to the EC2 instance.'
    Type: String
    Default: ''
  IAMUserSSHAccess:
    Description: 'Synchronize public keys of IAM users to enable personalized SSH access (https://github.com/widdix/aws-ec2-ssh)?'
    Type: String
    Default: false
    AllowedValues: [true, false]
  SystemsManagerAccess:
    Description: 'Enable AWS Systems Manager agent and Session Manager.'
    Type: String
    Default: true
    AllowedValues: [true, false]
  InstanceType:
    Description: 'The instance type for the EC2 instance.'
    Type: String
    Default: 't3.medium'
  InstanceName:
    Description: 'The name for the EC2 instance (auto generated if not set).'
    Type: String
    Default: ''
  SubnetReach:
    Description: 'Subnet reach.'
    Type: String
    Default: Public
    AllowedValues:
    - Public
    - Private
  LogsRetentionInDays:
    Description: 'Specifies the number of days you want to retain log events.'
    Type: Number
    Default: 14
    AllowedValues: [1, 3, 5, 7, 14, 30, 60, 90, 120, 150, 180, 365, 400, 545, 731, 1827, 3653]
  UserData:
    Description: 'Optional Bash script executed on first instance launch.'
    Type: String
    Default: ''
  IngressTcpPort1:
    Description: 'Optional port allowing ingress TCP traffic.'
    Type: String
    Default: ''
  IngressTcpPort2:
    Description: 'Optional port allowing ingress TCP traffic.'
    Type: String
    Default: ''
  IngressTcpPort3:
    Description: 'Optional port allowing ingress TCP traffic.'
    Type: String
    Default: ''
  ClientSgModule1:
    Description: 'Optional stack name of client-sg module to mark traffic from EC2 instance.'
    Type: String
    Default: ''
  ClientSgModule2:
    Description: 'Optional stack name of client-sg module to mark traffic from EC2 instance.'
    Type: String
    Default: ''
  ClientSgModule3:
    Description: 'Optional stack name of client-sg module to mark traffic from EC2 instance.'
    Type: String
    Default: ''
  FileSystemModule1:
    Description: 'Optional stack name of efs-file-system module.'
    Type: String
    Default: ''
  FileSystemModule2:
    Description: 'Optional stack name of efs-file-system module.'
    Type: String
    Default: ''
  FileSystemModule3:
    Description: 'Optional stack name of efs-file-system module.'
    Type: String
    Default: ''
  MaxSize:
    Description: 'The maximum size of the Auto Scaling group.'
    Type: Number
    Default: 4
    ConstraintDescription: 'Must be >= 1'
    MinValue: 1
  MinSize:
    Description: 'The minimum size of the Auto Scaling group.'
    Type: Number
    Default: 2
    ConstraintDescription: 'Must be >= 1'
    MinValue: 1
  DrainingTimeoutInSeconds:
    Description: 'Maximum time in seconds an EC2 instance waits when terminating until all containers are moved to another EC2 instance (draining).'
    Type: Number
    Default: 600 # 10 minutes
    ConstraintDescription: 'Must be in the range [60-86400]'
    MinValue: 60
    MaxValue: 86400 # 24 hours
  StopContainerTimeoutInSeconds:
    Description: 'Time in seconds the ECS agent waits before killing a stopped container (see ECS_CONTAINER_STOP_TIMEOUT).'
    Type: Number
    Default: 300 # 5 minutes
    ConstraintDescription: 'Must be in the range [30-3600]'
    MinValue: 30
    MaxValue: 3600 # 1 hour
  ContainerMaxCPU:
    Description: 'The maximum CPU reservation (in CPU units) per container that you plan to run on this cluster. A container instance has 1,024 CPU units for every CPU core.'
    Type: Number
    Default: 128
  ContainerMaxMemory:
    Description: 'The maximum memory reservation (in MB) per container that you plan to run on this cluster.'
    Type: Number
    Default: 128
  ContainerShortageThreshold:
    Description: 'Scale up if the free cluster capacity is <= this number of containers (based on the ContainerMaxCPU and ContainerMaxMemory settings).'
    Type: Number
    Default: 2
    MinValue: 0
    ConstraintDescription: 'Must be >= 0'
  ContainerExcessThreshold:
    Description: 'Scale down if the free cluster capacity is >= this number of containers (based on the ContainerMaxCPU and ContainerMaxMemory settings).'
    Type: Number
    Default: 10
    MinValue: 2
    ConstraintDescription: 'Must be >= 2'
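  # Worked example (added sketch, not part of the original template).
  # Assumption: the custom SchedulableContainers metric counts, per container instance,
  #   min(floor(freeCPU / ContainerMaxCPU), floor(freeMemory / ContainerMaxMemory))
  # and sums the result over the cluster. The code that publishes this metric is not shown here.
  # With the defaults (ContainerMaxCPU=128, ContainerMaxMemory=128), an idle instance with
  # 2 vCPUs registers 2 * 1024 = 2048 CPU units, so CPU allows floor(2048 / 128) = 16 containers.
  # If that instance registers roughly 3800 MB of memory (hypothetical figure), memory allows
  # floor(3800 / 128) = 29 containers, so the instance contributes min(16, 29) = 16.
  # The alarms in this template scale up at <= ContainerShortageThreshold (default 2) and
  # scale down at >= ContainerExcessThreshold (default 10) schedulable containers.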
  ManagedPolicyArns:
    Description: 'Optional comma-delimited list of IAM managed policy ARNs to attach to the instance''s IAM role'
    Type: String
    Default: ''
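# Usage sketch (added, not part of the original template): one possible way to
# instantiate this template as a nested stack, following the cfn-modules convention
# of passing module stack names as parameters. Assumptions: the template is saved as
# './ecs-cluster.yml' (hypothetical path, uploaded via `aws cloudformation package`)
# and the parent template contains a nested stack named 'Vpc' (hypothetical) that
# exposes a StackName output.
#
# Resources:
#   EcsCluster:
#     Type: 'AWS::CloudFormation::Stack'
#     Properties:
#       TemplateURL: './ecs-cluster.yml' # hypothetical path
#       Parameters:
#         VpcModule: !GetAtt 'Vpc.Outputs.StackName' # required
#         InstanceType: 't3.medium' # optional, same as the default
#         MinSize: '2' # optional, same as the default
#         MaxSize: '4' # optional, same as the default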
Mappings:
  RegionMap:
    'eu-north-1':
      ECSAMI: 'ami-0dddc4daca44e6e99'
    'ap-south-1':
      ECSAMI: 'ami-04322e867758d97a8'
    'eu-west-3':
      ECSAMI: 'ami-07273195833e4f20c'
    'eu-west-2':
      ECSAMI: 'ami-0204aa6a92a54561e'
    'eu-west-1':
      ECSAMI: 'ami-0c5abd45f676aab4f'
    'ap-northeast-2':
      ECSAMI: 'ami-08834c8c57e502d6d'
    'ap-northeast-1':
      ECSAMI: 'ami-0e52aad6ac7733a6a'
    'sa-east-1':
      ECSAMI: 'ami-00d851648873aaabc'
    'ca-central-1':
      ECSAMI: 'ami-0498c464ec4d2ba83'
    'ap-southeast-1':
      ECSAMI: 'ami-0047bfdb16f1f6781'
    'ap-southeast-2':
      ECSAMI: 'ami-09475847322e5566f'
    'eu-central-1':
      ECSAMI: 'ami-096a38c97b80cd8ec'
    'us-east-1':
      ECSAMI: 'ami-00cf4737e238866a3'
    'us-east-2':
      ECSAMI: 'ami-012ca23958772cf72'
    'us-west-1':
      ECSAMI: 'ami-06d87f0156b1d4407'
    'us-west-2':
      ECSAMI: 'ami-0a9f5be2a016dccad'
Conditions:
  HasAlertingModule: !Not [!Equals [!Ref AlertingModule, '']]
  HasBastionModule: !Not [!Equals [!Ref BastionModule, '']]
  HasNotBastionModule: !Not [!Condition HasBastionModule]
  HasFileSystemModule1: !Not [!Equals [!Ref FileSystemModule1, '']]
  HasFileSystemModule2: !Not [!Equals [!Ref FileSystemModule2, '']]
  HasFileSystemModule3: !Not [!Equals [!Ref FileSystemModule3, '']]
  HasAlbModule: !Not [!Equals [!Ref AlbModule, '']]
  HasKeyName: !Not [!Equals [!Ref KeyName, '']]
  HasIAMUserSSHAccess: !Equals [!Ref IAMUserSSHAccess, 'true']
  HasSystemsManagerAccess: !Equals [!Ref SystemsManagerAccess, 'true']
  HasInstanceName: !Not [!Equals [!Ref InstanceName, '']]
  HasSubnetReachPublic: !Equals [!Ref SubnetReach, Public]
  HasIngressTcpPort1: !Not [!Equals [!Ref IngressTcpPort1, '']]
  HasIngressTcpPort2: !Not [!Equals [!Ref IngressTcpPort2, '']]
  HasIngressTcpPort3: !Not [!Equals [!Ref IngressTcpPort3, '']]
  HasClientSgModule1: !Not [!Equals [!Ref ClientSgModule1, '']]
  HasClientSgModule2: !Not [!Equals [!Ref ClientSgModule2, '']]
  HasClientSgModule3: !Not [!Equals [!Ref ClientSgModule3, '']]
  HasManagedPolicyArns: !Not [!Equals [!Ref ManagedPolicyArns, '']]
Resources:
  Cluster:
    Type: 'AWS::ECS::Cluster'
    Properties: {}
  LogGroup:
    Type: 'AWS::Logs::LogGroup'
    Properties:
      RetentionInDays: !Ref LogsRetentionInDays
  SecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: !Ref 'AWS::StackName'
      VpcId:
        'Fn::ImportValue': !Sub '${VpcModule}-Id'
  SecurityGroupIngressALB:
    Type: 'AWS::EC2::SecurityGroupIngress'
    Condition: HasAlbModule
    Properties:
      GroupId: !Ref SecurityGroup
      IpProtocol: tcp
      FromPort: 0
      ToPort: 65535
      SourceSecurityGroupId:
        'Fn::ImportValue': !Sub '${AlbModule}-SecurityGroupId'
  SecurityGroupIngressSSHBastion:
    Type: 'AWS::EC2::SecurityGroupIngress'
    Condition: HasBastionModule
    Properties:
      GroupId: !Ref SecurityGroup
      IpProtocol: tcp
      FromPort: 22
      ToPort: 22
      SourceSecurityGroupId:
        'Fn::ImportValue': !Sub '${BastionModule}-SecurityGroupId'
  SecurityGroupIngressSSHWorld:
    Type: 'AWS::EC2::SecurityGroupIngress'
    Condition: HasNotBastionModule
    Properties:
      GroupId: !Ref SecurityGroup
      IpProtocol: tcp
      FromPort: 22
      ToPort: 22
      CidrIp: '0.0.0.0/0'
  SecurityGroupIngressTcpPort1:
    Type: 'AWS::EC2::SecurityGroupIngress'
    Condition: HasIngressTcpPort1
    Properties:
      GroupId: !Ref SecurityGroup
      IpProtocol: tcp
      FromPort: !Ref IngressTcpPort1
      ToPort: !Ref IngressTcpPort1
      CidrIp: '0.0.0.0/0'
  SecurityGroupIngressTcpPort2:
    Type: 'AWS::EC2::SecurityGroupIngress'
    Condition: HasIngressTcpPort2
    Properties:
      GroupId: !Ref SecurityGroup
      IpProtocol: tcp
      FromPort: !Ref IngressTcpPort2
      ToPort: !Ref IngressTcpPort2
      CidrIp: '0.0.0.0/0'
  SecurityGroupIngressTcpPort3:
    Type: 'AWS::EC2::SecurityGroupIngress'
    Condition: HasIngressTcpPort3
    Properties:
      GroupId: !Ref SecurityGroup
      IpProtocol: tcp
      FromPort: !Ref IngressTcpPort3
      ToPort: !Ref IngressTcpPort3
      CidrIp: '0.0.0.0/0'
  InstanceProfile:
    Type: 'AWS::IAM::InstanceProfile'
    Properties:
      Roles:
      - !Ref Role
  Role:
    Type: 'AWS::IAM::Role'
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
        - Effect: Allow
          Principal:
            Service: 'ec2.amazonaws.com'
          Action: 'sts:AssumeRole'
      ManagedPolicyArns: !If [HasManagedPolicyArns, !Split [',', !Ref ManagedPolicyArns], !Ref 'AWS::NoValue']
      Policies:
      - !If
        - HasSystemsManagerAccess
        - PolicyName: ssm
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
            - Effect: Allow
              Action:
              - 'ssmmessages:*' # SSM Agent by https://docs.aws.amazon.com/systems-manager/latest/userguide/systems-manager-setting-up-messageAPIs.html
              - 'ssm:UpdateInstanceInformation' # SSM agent by https://docs.aws.amazon.com/systems-manager/latest/userguide/systems-manager-setting-up-messageAPIs.html
              - 'ec2messages:*' # SSM Session Manager by https://docs.aws.amazon.com/systems-manager/latest/userguide/systems-manager-setting-up-messageAPIs.html
              Resource: '*'
        - !Ref 'AWS::NoValue'
      - PolicyName: logs
        PolicyDocument:
          Version: '2012-10-17'
          Statement:
          - Effect: Allow
            Action:
            - 'logs:CreateLogGroup'
            - 'logs:CreateLogStream'
            - 'logs:PutLogEvents'
            - 'logs:DescribeLogStreams'
            Resource: !GetAtt 'LogGroup.Arn'
      - PolicyName: ecs
        PolicyDocument:
          Version: '2012-10-17'
          Statement:
          - Effect: Allow
            Action:
            - 'ecs:DiscoverPollEndpoint'
            Resource: '*'
          - Effect: Allow
            Action:
            - 'ecs:DeregisterContainerInstance'
            - 'ecs:RegisterContainerInstance'
            - 'ecs:SubmitContainerStateChange'
            - 'ecs:SubmitTaskStateChange'
            - 'ecs:ListContainerInstances'
            Resource: !Sub 'arn:aws:ecs:${AWS::Region}:${AWS::AccountId}:cluster/${Cluster}'
          - Effect: Allow
            Action:
            - 'ecs:Poll'
            - 'ecs:StartTelemetrySession'
            - 'ecs:UpdateContainerInstancesState'
            - 'ecs:ListTasks'
            - 'ecs:DescribeContainerInstances'
            Resource: !Sub 'arn:aws:ecs:${AWS::Region}:${AWS::AccountId}:container-instance/*'
            Condition:
              'StringEquals':
                'ecs:cluster':
                  !Sub 'arn:aws:ecs:${AWS::Region}:${AWS::AccountId}:cluster/${Cluster}'
      - PolicyName: ecr
        PolicyDocument:
          Version: '2012-10-17'
          Statement:
          - Effect: Allow
            Action:
            - 'ecr:GetAuthorizationToken'
            - 'ecr:BatchCheckLayerAvailability'
            - 'ecr:GetDownloadUrlForLayer'
            - 'ecr:BatchGetImage'
            Resource: '*'
      - PolicyName: autoscaling
        PolicyDocument:
          Version: '2012-10-17'
          Statement:
          - Sid: write
            Effect: Allow
            Action: 'autoscaling:CompleteLifecycleAction'
            Resource: '*'
      - PolicyName: sqs
        PolicyDocument:
          Version: '2012-10-17'
          Statement:
          - Sid: write
            Effect: Allow
            Action:
            - 'sqs:DeleteMessage'
            - 'sqs:ReceiveMessage'
            Resource: !GetAtt 'AutoScalingGroupLifecycleHookQueue.Arn'
  PolicySshAccess:
    Type: 'AWS::IAM::Policy'
    Condition: HasIAMUserSSHAccess
    Properties:
      Roles:
      - !Ref Role
      PolicyName: 'ssh-access'
      PolicyDocument:
        Version: '2012-10-17'
        Statement:
        - Effect: Allow
          Action:
          - 'iam:ListUsers'
          - 'iam:GetGroup'
          Resource: '*'
        - Effect: Allow
          Action:
          - 'iam:ListSSHPublicKeys'
          - 'iam:GetSSHPublicKey'
          Resource: !Sub 'arn:${AWS::Partition}:iam::${AWS::AccountId}:user/*'
        - Effect: Allow
          Action: 'ec2:DescribeTags'
          Resource: '*'
  PolicyAssociateAddress:
    Type: 'AWS::IAM::Policy'
    Condition: HasSubnetReachPublic
    Properties:
      Roles:
      - !Ref Role
      PolicyName: 'ec2'
      PolicyDocument:
        Version: '2012-10-17'
        Statement:
        - Effect: Allow
          Action: 'ec2:AssociateAddress'
          Resource: '*'
  LaunchConfiguration:
    Type: 'AWS::AutoScaling::LaunchConfiguration'
    Metadata:
      'AWS::CloudFormation::Init':
        configSets:
          default: !If [HasIAMUserSSHAccess, [awslogs, ssh-access, install], [awslogs, install]]
        awslogs:
          packages:
            yum:
              awslogs: []
          files:
            '/etc/awslogs/awscli.conf':
              content: !Sub |
                [default]
                region = ${AWS::Region}
                [plugins]
                cwlogs = cwlogs
              mode: '000644'
              owner: root
              group: root
            '/etc/awslogs/awslogs.conf':
              content: !Sub |
                [general]
                state_file = /var/lib/awslogs/agent-state
                [/var/log/amazon/ssm/amazon-ssm-agent.log]
                datetime_format = %Y-%m-%d %H:%M:%S
                file = /var/log/amazon/ssm/amazon-ssm-agent.log
                log_stream_name = {instance_id}/var/log/amazon/ssm/amazon-ssm-agent.log
                log_group_name = ${LogGroup}
                [/var/log/amazon/ssm/errors.log]
                datetime_format = %Y-%m-%d %H:%M:%S
                file = /var/log/amazon/ssm/errors.log
                log_stream_name = {instance_id}/var/log/amazon/ssm/errors.log
                log_group_name = ${LogGroup}
                [/var/log/audit/audit.log]
                file = /var/log/audit/audit.log
                log_stream_name = {instance_id}/var/log/audit/audit.log
                log_group_name = ${LogGroup}
                [/var/log/awslogs.log]
                datetime_format = %Y-%m-%d %H:%M:%S
                file = /var/log/awslogs.log
                log_stream_name = {instance_id}/var/log/awslogs.log
                log_group_name = ${LogGroup}
                [/var/log/boot.log]
                file = /var/log/boot.log
                log_stream_name = {instance_id}/var/log/boot.log
                log_group_name = ${LogGroup}
                [/var/log/cfn-hup.log]
                datetime_format = %Y-%m-%d %H:%M:%S
                file = /var/log/cfn-hup.log
                log_stream_name = {instance_id}/var/log/cfn-hup.log
                log_group_name = ${LogGroup}
                [/var/log/cfn-init-cmd.log]
                datetime_format = %Y-%m-%d %H:%M:%S
                file = /var/log/cfn-init-cmd.log
                log_stream_name = {instance_id}/var/log/cfn-init-cmd.log
                log_group_name = ${LogGroup}
                [/var/log/cfn-init.log]
                datetime_format = %Y-%m-%d %H:%M:%S
                file = /var/log/cfn-init.log
                log_stream_name = {instance_id}/var/log/cfn-init.log
                log_group_name = ${LogGroup}
                [/var/log/cfn-wire.log]
                datetime_format = %Y-%m-%d %H:%M:%S
                file = /var/log/cfn-wire.log
                log_stream_name = {instance_id}/var/log/cfn-wire.log
                log_group_name = ${LogGroup}
                [/var/log/cloud-init-output.log]
                file = /var/log/cloud-init-output.log
                log_stream_name = {instance_id}/var/log/cloud-init-output.log
                log_group_name = ${LogGroup}
                [/var/log/cloud-init.log]
                datetime_format = %b %d %H:%M:%S
                file = /var/log/cloud-init.log
                log_stream_name = {instance_id}/var/log/cloud-init.log
                log_group_name = ${LogGroup}
                [/var/log/cron]
                datetime_format = %b %d %H:%M:%S
                file = /var/log/cron
                log_stream_name = {instance_id}/var/log/cron
                log_group_name = ${LogGroup}
                [/var/log/dmesg]
                file = /var/log/dmesg
                log_stream_name = {instance_id}/var/log/dmesg
                log_group_name = ${LogGroup}
                [/var/log/grubby_prune_debug]
                file = /var/log/grubby_prune_debug
                log_stream_name = {instance_id}/var/log/grubby_prune_debug
                log_group_name = ${LogGroup}
                [/var/log/maillog]
                datetime_format = %b %d %H:%M:%S
                file = /var/log/maillog
                log_stream_name = {instance_id}/var/log/maillog
                log_group_name = ${LogGroup}
                [/var/log/messages]
                datetime_format = %b %d %H:%M:%S
                file = /var/log/messages
                log_stream_name = {instance_id}/var/log/messages
                log_group_name = ${LogGroup}
                [/var/log/secure]
                datetime_format = %b %d %H:%M:%S
                file = /var/log/secure
                log_stream_name = {instance_id}/var/log/secure
                log_group_name = ${LogGroup}
                [/var/log/yum.log]
                datetime_format = %b %d %H:%M:%S
                file = /var/log/yum.log
                log_stream_name = {instance_id}/var/log/yum.log
                log_group_name = ${LogGroup}
              mode: '000644'
              owner: root
              group: root
            '/etc/awslogs/config/ecs.conf':
              content: !Sub |
                [/var/log/ecs/ecs-init.log]
                file = /var/log/ecs/ecs-init.log
                log_group_name = ${LogGroup}
                log_stream_name = {instance_id}/var/log/ecs/ecs-init.log
                datetime_format = %Y-%m-%dT%H:%M:%SZ
                [/var/log/ecs/ecs-agent.log]
                file = /var/log/ecs/ecs-agent.log.*
                log_stream_name = {instance_id}/var/log/ecs/ecs-agent.log
                log_group_name = ${LogGroup}
                datetime_format = %Y-%m-%dT%H:%M:%SZ
              mode: '000644'
              owner: root
              group: root
          services:
            sysvinit:
              awslogsd:
                enabled: true
                ensureRunning: true
                packages:
                  yum:
                  - awslogs
                files:
                - '/etc/awslogs/awslogs.conf'
                - '/etc/awslogs/awscli.conf'
                - '/etc/awslogs/config/ecs.conf'
        ssh-access:
          packages:
            rpm:
              aws-ec2-ssh: 'https://s3-eu-west-1.amazonaws.com/widdix-aws-ec2-ssh-releases-eu-west-1/aws-ec2-ssh-1.9.2-1.el7.centos.noarch.rpm'
          commands:
            a_configure_sudo:
              command: 'sed -i ''s/SUDOERS_GROUPS=""/SUDOERS_GROUPS="##ALL##"/g'' /etc/aws-ec2-ssh.conf'
              test: 'grep -q ''SUDOERS_GROUPS=""'' /etc/aws-ec2-ssh.conf'
            b_enable:
              command: 'sed -i ''s/DONOTSYNC=1/DONOTSYNC=0/g'' /etc/aws-ec2-ssh.conf && /usr/bin/import_users.sh'
              test: 'grep -q ''DONOTSYNC=1'' /etc/aws-ec2-ssh.conf'
        install:
          packages:
            yum:
              amazon-ssm-agent: []
          files:
            '/etc/cfn/cfn-hup.conf':
              content: !Sub |
                [main]
                stack=${AWS::StackId}
                region=${AWS::Region}
                interval=1
              mode: '000400'
              owner: root
              group: root
            '/etc/cfn/hooks.d/cfn-auto-reloader.conf':
              content: !Sub |
                [cfn-auto-reloader-hook]
                triggers=post.update
                path=Resources.LaunchConfiguration.Metadata.AWS::CloudFormation::Init
                action=/opt/aws/bin/cfn-init --verbose --stack=${AWS::StackName} --region=${AWS::Region} --resource=LaunchConfiguration
                runas=root
          services:
            sysvinit:
              cfn-hup:
                enabled: true
                ensureRunning: true
                files:
                - '/etc/cfn/cfn-hup.conf'
                - '/etc/cfn/hooks.d/cfn-auto-reloader.conf'
              amazon-ssm-agent:
                enabled: !If [HasSystemsManagerAccess, true, false]
                ensureRunning: !If [HasSystemsManagerAccess, true, false]
                packages:
                  yum:
                  - amazon-ssm-agent
    Properties:
      AssociatePublicIpAddress: !If [HasSubnetReachPublic, true, false]
      IamInstanceProfile: !Ref InstanceProfile
      ImageId: !FindInMap [RegionMap, !Ref 'AWS::Region', ECSAMI]
      InstanceMonitoring: false
      InstanceType: !Ref InstanceType
      KeyName: !If [HasKeyName, !Ref KeyName, !Ref 'AWS::NoValue']
      SecurityGroups:
      - !Ref SecurityGroup
      - !If [HasClientSgModule1, {'Fn::ImportValue': !Sub '${ClientSgModule1}-SecurityGroupId'}, !Ref 'AWS::NoValue']
      - !If [HasClientSgModule2, {'Fn::ImportValue': !Sub '${ClientSgModule2}-SecurityGroupId'}, !Ref 'AWS::NoValue']
      - !If [HasClientSgModule3, {'Fn::ImportValue': !Sub '${ClientSgModule3}-SecurityGroupId'}, !Ref 'AWS::NoValue']
      UserData:
        'Fn::Base64': !Sub
        - |
            #!/bin/bash -ex
            trap '/opt/aws/bin/cfn-signal -e 1 --region ${Region} --stack ${StackName} --resource AutoScalingGroup' ERR
            echo "ECS_CLUSTER=${Cluster}" >> /etc/ecs/ecs.config
            echo "ECS_CONTAINER_STOP_TIMEOUT=${StopContainerTimeoutInSeconds}s" >> /etc/ecs/ecs.config
            yum install -y aws-cfn-bootstrap
            ${UserDataMountFileSystem1}
            ${UserDataMountFileSystem2}
            ${UserDataMountFileSystem3}
            mount -a
            /opt/aws/bin/cfn-init -v --region ${Region} --stack ${StackName} --resource LaunchConfiguration
            ${UserData}
            /opt/aws/bin/cfn-signal -e 0 --region ${Region} --stack ${StackName} --resource AutoScalingGroup
        - Region: !Ref 'AWS::Region'
          StackName: !Ref 'AWS::StackName'
          UserDataMountFileSystem1: !If [HasFileSystemModule1, !Join ['', ['yum install -y amazon-efs-utils && mkdir -p /mnt/efs1 && echo "', {'Fn::ImportValue': !Sub '${FileSystemModule1}-Id'}, ':/ /mnt/efs1 efs defaults,_netdev 0 0" >> /etc/fstab']], '']
          UserDataMountFileSystem2: !If [HasFileSystemModule2, !Join ['', ['yum install -y amazon-efs-utils && mkdir -p /mnt/efs2 && echo "', {'Fn::ImportValue': !Sub '${FileSystemModule2}-Id'}, ':/ /mnt/efs2 efs defaults,_netdev 0 0" >> /etc/fstab']], '']
          UserDataMountFileSystem3: !If [HasFileSystemModule3, !Join ['', ['yum install -y amazon-efs-utils && mkdir -p /mnt/efs3 && echo "', {'Fn::ImportValue': !Sub '${FileSystemModule3}-Id'}, ':/ /mnt/efs3 efs defaults,_netdev 0 0" >> /etc/fstab']], '']
          UserData: !Ref UserData
  AutoScalingGroup:
    Type: 'AWS::AutoScaling::AutoScalingGroup'
    Properties:
      LaunchConfigurationName: !Ref LaunchConfiguration
      MaxSize: !Ref MaxSize
      MinSize: !Ref MinSize
      Cooldown: '120'
      HealthCheckGracePeriod: 300
      # HealthCheckType: ELB
      # TargetGroupARNs:
      # - !Ref DefaultTargetGroup
      NotificationConfigurations: !If
      - HasAlertingModule
      - - NotificationTypes:
          - 'autoscaling:EC2_INSTANCE_LAUNCH_ERROR'
          - 'autoscaling:EC2_INSTANCE_TERMINATE_ERROR'
          TopicARN:
            'Fn::ImportValue': !Sub '${AlertingModule}-Arn'
      - []
      Tags:
      - Key: Name
        Value: !If [HasInstanceName, !Ref InstanceName, !Sub '${AWS::StackName}-instance']
        PropagateAtLaunch: true
      VPCZoneIdentifier: !Split
      - ','
      - 'Fn::ImportValue': !Sub '${VpcModule}-SubnetIds${SubnetReach}'
    CreationPolicy:
      ResourceSignal:
        Count: 1
        Timeout: PT15M
    UpdatePolicy:
      AutoScalingRollingUpdate:
        PauseTime: PT15M
        SuspendProcesses:
        - HealthCheck
        - ReplaceUnhealthy
        - AZRebalance
        - AlarmNotification
        - ScheduledActions
        WaitOnResourceSignals: true
  CPUTooHighAlarm:
    Condition: HasAlertingModule
    Type: 'AWS::CloudWatch::Alarm'
    Properties:
      AlarmDescription: 'Average CPU utilization over last 10 minutes higher than 80%'
      Namespace: 'AWS/EC2'
      MetricName: CPUUtilization
      Statistic: Average
      Period: 600
      EvaluationPeriods: 1
      ComparisonOperator: GreaterThanThreshold
      Threshold: 80
      AlarmActions:
      - 'Fn::ImportValue': !Sub '${AlertingModule}-Arn'
      Dimensions:
      - Name: AutoScalingGroupName
        Value: !Ref AutoScalingGroup
  AutoScalingGroupLifecycleHookQueue:
    Type: 'AWS::SQS::Queue'
    Properties:
      QueueName: !Sub '${AWS::StackName}-lifecycle-hook'
      VisibilityTimeout: 60
      RedrivePolicy:
        deadLetterTargetArn: !GetAtt 'AutoScalingGroupLifecycleHookDeadLetterQueue.Arn'
        maxReceiveCount: 5
  AutoScalingGroupLifecycleHookQueueTooHighAlarm:
    Condition: HasAlertingModule
    Type: 'AWS::CloudWatch::Alarm'
    Properties:
      AlarmDescription: 'Queue contains messages older than 10 minutes, messages are not consumed'
      Namespace: 'AWS/SQS'
      MetricName: ApproximateAgeOfOldestMessage
      Statistic: Maximum
      Period: 60
      EvaluationPeriods: 1
      ComparisonOperator: GreaterThanThreshold
      Threshold: 600
      AlarmActions:
      - 'Fn::ImportValue': !Sub '${AlertingModule}-Arn'
      Dimensions:
      - Name: QueueName
        Value: !GetAtt 'AutoScalingGroupLifecycleHookQueue.QueueName'
  AutoScalingGroupLifecycleHookDeadLetterQueue:
    Type: 'AWS::SQS::Queue'
    Properties:
      QueueName: !Sub '${AWS::StackName}-lifecycle-hook-dlq'
  AutoScalingGroupLifecycleHookDeadLetterQueueTooHighAlarm:
    Condition: HasAlertingModule
    Type: 'AWS::CloudWatch::Alarm'
    Properties:
      AlarmDescription: 'Dead letter queue contains messages, message processing failed'
      Namespace: 'AWS/SQS'
      MetricName: ApproximateNumberOfMessagesVisible
      Statistic: Sum
      Period: 60
      EvaluationPeriods: 1
      ComparisonOperator: GreaterThanThreshold
      Threshold: 0
      AlarmActions:
      - 'Fn::ImportValue': !Sub '${AlertingModule}-Arn'
      Dimensions:
      - Name: QueueName
        Value: !GetAtt 'AutoScalingGroupLifecycleHookDeadLetterQueue.QueueName'
  AutoScalingGroupLifecycleHookIAMRole:
    Type: 'AWS::IAM::Role'
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
        - Effect: Allow
          Principal:
            Service: 'autoscaling.amazonaws.com'
          Action: 'sts:AssumeRole'
      Policies:
      - PolicyName: sqs
        PolicyDocument:
          Version: '2012-10-17'
          Statement:
          - Sid: write
            Effect: Allow
            Action:
            - 'sqs:SendMessage'
            - 'sqs:GetQueueUrl'
            Resource: !GetAtt 'AutoScalingGroupLifecycleHookQueue.Arn'
  AutoScalingGroupTerminatingLifecycleHook:
    Type: 'AWS::AutoScaling::LifecycleHook'
    Properties:
      HeartbeatTimeout: 600
      DefaultResult: CONTINUE
      AutoScalingGroupName: !Ref AutoScalingGroup
      LifecycleTransition: 'autoscaling:EC2_INSTANCE_TERMINATING'
      NotificationTargetARN: !GetAtt 'AutoScalingGroupLifecycleHookQueue.Arn'
      RoleARN: !GetAtt 'AutoScalingGroupLifecycleHookIAMRole.Arn'
  ScaleUpPolicy:
    Type: 'AWS::AutoScaling::ScalingPolicy'
    Properties:
      AutoScalingGroupName: !Ref AutoScalingGroup
      PolicyType: StepScaling
      AdjustmentType: PercentChangeInCapacity
      MinAdjustmentMagnitude: 1
      StepAdjustments:
      - MetricIntervalUpperBound: 0.0
        ScalingAdjustment: 25
  ScaleDownPolicy:
    Type: 'AWS::AutoScaling::ScalingPolicy'
    Properties:
      AutoScalingGroupName: !Ref AutoScalingGroup
      PolicyType: StepScaling
      AdjustmentType: PercentChangeInCapacity
      MinAdjustmentMagnitude: 1
      StepAdjustments:
      - MetricIntervalLowerBound: 0.0
        ScalingAdjustment: -25
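  # Worked example (added note, not part of the original template). With
  # PercentChangeInCapacity, Auto Scaling rounds a result between 0 and 1 up to 1 and
  # rounds results greater than 1 down; MinAdjustmentMagnitude: 1 additionally ensures
  # that at least one instance is changed.
  #   ScaleUpPolicy (+25%):    2 instances -> 0.5 -> +1,   4 instances -> 1.0 -> +1,   10 instances -> 2.5 -> +2
  #   ScaleDownPolicy (-25%):  2 instances -> -0.5 -> -1,  4 instances -> -1.0 -> -1,  10 instances -> -2.5 -> -2
  # The resulting capacity always stays within the group's MinSize and MaxSize.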
  ContainerInstancesShortageAlarm:
    Type: 'AWS::CloudWatch::Alarm'
    Properties:
      AlarmDescription: 'Cluster is running out of container instances'
      Namespace: !Ref 'AWS::StackName'
      Dimensions:
      - Name: ClusterName
        Value: !Ref Cluster
      MetricName: SchedulableContainers
      ComparisonOperator: LessThanOrEqualToThreshold
      Statistic: Minimum # special rule because we scale on reservations and not utilization
      Period: 60
      EvaluationPeriods: 1
      Threshold: !Ref ContainerShortageThreshold
      AlarmActions:
      - !Ref ScaleUpPolicy
  ContainerInstancesExcessAlarm:
    Type: 'AWS::CloudWatch::Alarm'
    Properties:
      AlarmDescription: 'Cluster is wasting container instances'
      Namespace: !Ref 'AWS::StackName'
      Dimensions:
      - Name: ClusterName
        Value: !Ref Cluster
      MetricName: SchedulableContainers
      ComparisonOperator: GreaterThanOrEqualToThreshold
      Statistic: Maximum # special rule because we scale on reservations and not utilization
      Period: 60
      EvaluationPeriods: 15
      DatapointsToAlarm: 15
      Threshold: !Ref ContainerExcessThreshold
      AlarmActions:
      - !Ref ScaleDownPolicy
  CPUReservationTooHighAlarm:
    Condition: HasAlertingModule
    Type: 'AWS::CloudWatch::Alarm'
    Properties:
      AlarmDescription: 'Average CPU reservation over last 10 minutes higher than 90%'
      Namespace: 'AWS/ECS'
      MetricName: CPUReservation
      Statistic: Average # special rule because we scale on reservations and not utilization
      Period: 600
      EvaluationPeriods: 1
      ComparisonOperator: GreaterThanThreshold
      Threshold: 90
      AlarmActions:
      - 'Fn::ImportValue': !Sub '${AlertingModule}-Arn'
      Dimensions:
      - Name: ClusterName
        Value: !Ref Cluster
  CPUUtilizationTooHighAlarm:
    Condition: HasAlertingModule
    Type: 'AWS::CloudWatch::Alarm'
    Properties:
      AlarmDescription: 'Average CPU utilization over last 10 minutes higher than 80%'
      Namespace: 'AWS/ECS'
      MetricName: CPUUtilization
      Statistic: Average
      Period: 600
      EvaluationPeriods: 1
      ComparisonOperator: GreaterThanThreshold
      Threshold: 80
      AlarmActions:
      - 'Fn::ImportValue': !Sub '${AlertingModule}-Arn'
      Dimensions:
      - Name: ClusterName
        Value: !Ref Cluster
  MemoryReservationTooHighAlarm:
    Condition: HasAlertingModule
    Type: 'AWS::CloudWatch::Alarm'
    Properties:
      AlarmDescription: 'Average memory reservation over last 10 minutes higher than 90%'
      Namespace: 'AWS/ECS'
      MetricName: MemoryReservation
      Statistic: Average # special rule because we scale on reservations and not utilization
      Period: 600
      EvaluationPeriods: 1
      ComparisonOperator: GreaterThanThreshold
      Threshold: 90
      AlarmActions:
      - 'Fn::ImportValue': !Sub '${AlertingModule}-Arn'
      Dimensions:
      - Name: ClusterName
        Value: !Ref Cluster
  MemoryUtilizationTooHighAlarm:
    Condition: HasAlertingModule
    Type: 'AWS::CloudWatch::Alarm'
    Properties:
      AlarmDescription: 'Average memory utilization over last 10 minutes higher than 80%'
      Namespace: 'AWS/ECS'
      MetricName: MemoryUtilization
      Statistic: Average
      Period: 600
      EvaluationPeriods: 1
      ComparisonOperator: GreaterThanThreshold
      Threshold: 80
      AlarmActions:
      - 'Fn::ImportValue': !Sub '${AlertingModule}-Arn'
      Dimensions:
      - Name: ClusterName
        Value: !Ref Cluster
  # scaling based on SchedulableContainers is described in detail here: http://garbe.io/blog/2017/04/12/a-better-solution-to-ecs-autoscaling/
  SchedulableContainersCron:
    DependsOn:
    - SchedulableContainersLambdaPolicy
    Type: 'AWS::Events::Rule'
    Properties:
      ScheduleExpression: 'rate(1 minute)'
      State: ENABLED
      Targets:
      - Arn: !GetAtt 'SchedulableContainersLambdaV2.Arn'
        Id: lambda
  SchedulableContainersCronFailedInvocationsTooHighAlarm:
    Condition: HasAlertingModule
    Type: 'AWS::CloudWatch::Alarm'
    Properties:
      AlarmDescription: 'Invocations failed permanently'
      Namespace: 'AWS/Events'
      MetricName: FailedInvocations
      Statistic: Sum
      Period: 60
      EvaluationPeriods: 1
      ComparisonOperator: GreaterThanThreshold
      Threshold: 0
      AlarmActions:
      - 'Fn::ImportValue': !Sub '${AlertingModule}-Arn'
      Dimensions:
      - Name: RuleName
        Value: !Ref SchedulableContainersCron
  SchedulableContainersLambdaRole:
    Type: 'AWS::IAM::Role'
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
        - Effect: Allow
          Principal:
            Service: 'lambda.amazonaws.com'
          Action: 'sts:AssumeRole'
      Policies:
      - PolicyName: ecs
        PolicyDocument:
          Statement:
          - Effect: Allow
            Action: 'ecs:ListContainerInstances'
            Resource: !Sub 'arn:${AWS::Partition}:ecs:${AWS::Region}:${AWS::AccountId}:cluster/${Cluster}'
          - Effect: Allow
            Action: 'ecs:DescribeContainerInstances'
            Resource: '*'
            Condition:
              ArnEquals:
                'ecs:cluster': !Sub 'arn:${AWS::Partition}:ecs:${AWS::Region}:${AWS::AccountId}:cluster/${Cluster}'
      - PolicyName: cloudwatch
        PolicyDocument:
          Statement:
          - Effect: Allow
            Action: 'cloudwatch:PutMetricData'
            Resource: '*'
  SchedulableContainersLambdaPolicy:
    Type: 'AWS::IAM::Policy'
    Properties:
      Roles:
      - !Ref SchedulableContainersLambdaRole
      PolicyName: lambda
      PolicyDocument:
        Statement:
        - Effect: Allow
          Action:
          - 'logs:CreateLogStream'
          - 'logs:PutLogEvents'
          Resource: !GetAtt 'SchedulableContainersLogGroup.Arn'
  SchedulableContainersLambdaPermission2:
    Type: 'AWS::Lambda::Permission'
    Properties:
      Action: 'lambda:InvokeFunction'
      FunctionName: !Ref SchedulableContainersLambdaV2
      Principal: 'events.amazonaws.com'
      SourceArn: !GetAtt 'SchedulableContainersCron.Arn'
  SchedulableContainersLambdaV2:
    Type: 'AWS::Lambda::Function'
    Properties:
      Code:
        ZipFile: !Sub |
          'use strict';
          const AWS = require('aws-sdk');
          const ecs = new AWS.ECS({apiVersion: '2014-11-13'});
          const cloudwatch = new AWS.CloudWatch({apiVersion: '2010-08-01'});
          const CONTAINER_MAX_CPU = ${ContainerMaxCPU};
          const CONTAINER_MAX_MEMORY = ${ContainerMaxMemory};
          const CLUSTER = '${Cluster}';
          const NAMESPACE = '${AWS::StackName}';
          function list(nextToken) {
            return ecs.listContainerInstances({
              cluster: CLUSTER,
              maxResults: 100, // largest page size accepted by both the list and describe calls
              nextToken: nextToken,
              status: 'ACTIVE'
            }).promise();
          }
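          function describe(containerInstanceArns) {
            if (containerInstanceArns.length === 0) {
              // describeContainerInstances rejects an empty list (cluster without active instances)
              return Promise.resolve({containerInstances: []});
            }
            return ecs.describeContainerInstances({
              cluster: CLUSTER,
              containerInstances: containerInstanceArns
            }).promise();
          }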
          function compute(totalSchedulableContainers, nextToken) {
            return list(nextToken)
              .then((list) => {
                return describe(list.containerInstanceArns)
                  .then((data) => {
                    const localSchedulableContainers = data.containerInstances
                      .map((instance) => ({
                        cpu: instance.remainingResources.find((resource) => resource.name === 'CPU').integerValue,
                        memory: instance.remainingResources.find((resource) => resource.name === 'MEMORY').integerValue
                      }))
                      .map((remaining) => Math.min(Math.floor(remaining.cpu/CONTAINER_MAX_CPU), Math.floor(remaining.memory/CONTAINER_MAX_MEMORY)))
                      .reduce((acc, containers) => acc + containers, 0);
                    console.log(`localSchedulableContainers ${!localSchedulableContainers}`);
                    if (list.nextToken !== null && list.nextToken !== undefined) {
                      return compute(localSchedulableContainers + totalSchedulableContainers, list.nextToken);
                    } else {
                      return localSchedulableContainers + totalSchedulableContainers;
                    }
                  });
              });
          }
          exports.handler = (event, context, cb) => {
            console.log(`Invoke: ${!JSON.stringify(event)}`);
            compute(0, undefined)
              .then((schedulableContainers) => {
                console.log(`schedulableContainers: ${!schedulableContainers}`);
                return cloudwatch.putMetricData({
                  MetricData: [{
                    MetricName: 'SchedulableContainers',
                    Dimensions: [{
                      Name: 'ClusterName',
                      Value: CLUSTER
                    }],
                    Value: schedulableContainers,
                    Unit: 'Count'
                  }],
                  Namespace: NAMESPACE
                }).promise();
              })
              .then(() => cb())
              .catch(cb);
          };
      Handler: 'index.handler'
      MemorySize: 128
      Role: !GetAtt 'SchedulableContainersLambdaRole.Arn'
      Runtime: 'nodejs8.10'
      Timeout: 60
  SchedulableContainersLogGroup:
    Type: 'AWS::Logs::LogGroup'
    Properties:
      LogGroupName: !Sub '/aws/lambda/${SchedulableContainersLambdaV2}'
      RetentionInDays: !Ref LogsRetentionInDays
  SchedulableContainersLambdaErrorsTooHighAlarm:
    Condition: HasAlertingModule
    Type: 'AWS::CloudWatch::Alarm'
    Properties:
      AlarmDescription: 'Invocations failed due to errors in the function'
      Namespace: 'AWS/Lambda'
      MetricName: Errors
      Statistic: Sum
      Period: 60
      EvaluationPeriods: 1
      ComparisonOperator: GreaterThanThreshold
      Threshold: 0
      AlarmActions:
      - 'Fn::ImportValue': !Sub '${AlertingModule}-Arn'
      Dimensions:
      - Name: FunctionName
        Value: !Ref SchedulableContainersLambdaV2
  SchedulableContainersLambdaThrottlesTooHighAlarm:
    Condition: HasAlertingModule
    Type: 'AWS::CloudWatch::Alarm'
    Properties:
      AlarmDescription: 'Invocation attempts that were throttled due to invocation rates exceeding the concurrent limits'
      Namespace: 'AWS/Lambda'
      MetricName: Throttles
      Statistic: Sum
      Period: 60
      EvaluationPeriods: 1
      ComparisonOperator: GreaterThanThreshold
      Threshold: 0
      AlarmActions:
      - 'Fn::ImportValue': !Sub '${AlertingModule}-Arn'
      Dimensions:
      - Name: FunctionName
        Value: !Ref SchedulableContainersLambdaV2
  DrainInstanceLambdaRole:
    Type: 'AWS::IAM::Role'
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
        - Effect: Allow
          Principal:
            Service: 'lambda.amazonaws.com'
          Action: 'sts:AssumeRole'
      Policies:
      - PolicyName: draininstance
        PolicyDocument:
          Statement:
          - Effect: Allow
            Action:
            - 'sqs:DeleteMessage'
            - 'sqs:ReceiveMessage'
            - 'sqs:SendMessage'
            - 'sqs:GetQueueAttributes'
            Resource: !GetAtt 'AutoScalingGroupLifecycleHookQueue.Arn'
          - Effect: Allow
            Action:
            - 'ecs:ListContainerInstances'
            Resource: !GetAtt 'Cluster.Arn'
          - Effect: Allow
            Action:
            - 'ecs:UpdateContainerInstancesState'
            - 'ecs:ListTasks'
            Resource: '*'
            Condition:
              StringEquals:
                'ecs:cluster': !GetAtt 'Cluster.Arn'
          - Effect: Allow
            Action:
            - 'autoscaling:CompleteLifecycleAction'
            - 'autoscaling:RecordLifecycleActionHeartbeat'
            Resource: !Sub 'arn:${AWS::Partition}:autoscaling:${AWS::Region}:${AWS::AccountId}:autoScalingGroup:*:autoScalingGroupName/${AutoScalingGroup}'
  DrainInstanceLambdaPolicy:
    Type: 'AWS::IAM::Policy'
    Properties:
      Roles:
      - !Ref DrainInstanceLambdaRole
      PolicyName: lambda
      PolicyDocument:
        Statement:
        - Effect: Allow
          Action:
          - 'logs:CreateLogStream'
          - 'logs:PutLogEvents'
          Resource: !GetAtt 'DrainInstanceLogGroup.Arn'
  DrainInstanceEventSourceMapping:
    DependsOn:
    - DrainInstanceLambdaPolicy
    - DrainInstanceLogGroup
    Type: 'AWS::Lambda::EventSourceMapping'
    Properties:
      BatchSize: 1
      Enabled: true
      EventSourceArn: !GetAtt 'AutoScalingGroupLifecycleHookQueue.Arn'
      FunctionName: !GetAtt DrainInstanceLambda.Arn
  DrainInstanceLambda:
    Type: 'AWS::Lambda::Function'
    Properties:
      Code:
        ZipFile: |
          'use strict';
          const AWS = require('aws-sdk');
          const ecs = new AWS.ECS({apiVersion: '2014-11-13'});
          const sqs = new AWS.SQS({apiVersion: '2012-11-05'});
          const asg = new AWS.AutoScaling({apiVersion: '2011-01-01'});
          const cluster = process.env.CLUSTER;
          const queueUrl = process.env.QUEUE_URL;
          const drainingTimeout = parseInt(process.env.DRAINING_TIMEOUT, 10); // env vars are strings; timeout is in seconds
          async function getContainerInstanceArn(ec2InstanceId) {
            console.log(`getContainerInstanceArn(${[...arguments].join(', ')})`);
            const listResult = await ecs.listContainerInstances({cluster: cluster, filter: `ec2InstanceId == '${ec2InstanceId}'`}).promise();
            return listResult.containerInstanceArns[0];
          }
          async function drainInstance(ciArn) {
            console.log(`drainInstance(${[...arguments].join(', ')})`);
            await ecs.updateContainerInstancesState({cluster: cluster, containerInstances: [ciArn], status: 'DRAINING'}).promise();
          }
          async function wait(ciArn, asgName, lchName, lcaToken, terminateTime) {
            console.log(`wait(${[...arguments].join(', ')})`);
            const payload = {
              Service: 'DrainInstance',
              Event: 'custom:DRAIN_WAIT',
              ContainerInstanceArn: ciArn,
              AutoScalingGroupName: asgName,
              LifecycleHookName: lchName,
              LifecycleActionToken: lcaToken,
              TerminateTime: terminateTime
            };
            await sqs.sendMessage({
              QueueUrl: queueUrl,
              DelaySeconds: 60,
              MessageBody: JSON.stringify(payload)
            }).promise();
          }
          async function countTasks(ciArn) {
            console.log(`countTasks(${[...arguments].join(', ')})`);
            const listResult = await ecs.listTasks({cluster: cluster, containerInstance: ciArn}).promise();
            return listResult.taskArns.length;
          }
          async function terminateInstance(asgName, lchName, lcaToken) {
            console.log(`terminateInstance(${[...arguments].join(', ')})`);
            await asg.completeLifecycleAction({
              AutoScalingGroupName: asgName,
              LifecycleHookName: lchName,
              LifecycleActionToken: lcaToken,
              LifecycleActionResult: 'CONTINUE'
            }).promise();
          }
          async function heartbeat(asgName, lchName, lcaToken) {
            console.log(`heartbeat(${[...arguments].join(', ')})`);
            await asg.recordLifecycleActionHeartbeat({
              AutoScalingGroupName: asgName,
              LifecycleHookName: lchName,
              LifecycleActionToken: lcaToken
            }).promise();
          }
          exports.handler = async function(event, context) {
            console.log(`Invoke: ${JSON.stringify(event)}`);
            const body = JSON.parse(event.Records[0].body); // batch size is 1
            if (body.Service === 'AWS Auto Scaling' && body.Event === 'autoscaling:TEST_NOTIFICATION') {
              console.log('Ignore autoscaling:TEST_NOTIFICATION');
            } else if (body.Service === 'AWS Auto Scaling' && body.LifecycleTransition === 'autoscaling:EC2_INSTANCE_TERMINATING') {
              const lcaToken = body.LifecycleActionToken;
              const ciArn = await getContainerInstanceArn(body.EC2InstanceId);
              await drainInstance(ciArn);
              await wait(ciArn, body.AutoScalingGroupName, body.LifecycleHookName, lcaToken, body.Time);
            } else if (body.Service === 'DrainInstance' && body.Event === 'custom:DRAIN_WAIT') {
              const taskCount = await countTasks(body.ContainerInstanceArn);
              if (taskCount === 0) {
                await terminateInstance(body.AutoScalingGroupName, body.LifecycleHookName, body.LifecycleActionToken);
              } else {
                const actionDuration = (Date.now() - new Date(body.TerminateTime).getTime()) / 1000;
                if (actionDuration < drainingTimeout) {
                  await heartbeat(body.AutoScalingGroupName, body.LifecycleHookName, body.LifecycleActionToken);
                  await wait(body.ContainerInstanceArn, body.AutoScalingGroupName, body.LifecycleHookName, body.LifecycleActionToken, body.TerminateTime);
                } else {
                  console.log('Timeout for instance termination reached.');
                  await terminateInstance(body.AutoScalingGroupName, body.LifecycleHookName, body.LifecycleActionToken);
                }
              }
            } else {
              console.log('Ignore unexpected event');
            }
          };
      Handler: 'index.handler'
      MemorySize: 128
      Role: !GetAtt 'DrainInstanceLambdaRole.Arn'
      Runtime: 'nodejs8.10'
      Timeout: 30
      Environment:
        Variables:
          CLUSTER: !Ref Cluster
          QUEUE_URL: !Ref AutoScalingGroupLifecycleHookQueue
          DRAINING_TIMEOUT: !Ref DrainingTimeoutInSeconds
      ReservedConcurrentExecutions: 1
  DrainInstanceLogGroup:
    Type: 'AWS::Logs::LogGroup'
    Properties:
      LogGroupName: !Sub '/aws/lambda/${DrainInstanceLambda}'
      RetentionInDays: !Ref LogsRetentionInDays
  DrainInstanceLambdaErrorsTooHighAlarm:
    Condition: HasAlertingModule
    Type: 'AWS::CloudWatch::Alarm'
    Properties:
      AlarmDescription: 'Invocations failed due to errors in the function'
      Namespace: 'AWS/Lambda'
      MetricName: Errors
      Statistic: Sum
      Period: 60
      EvaluationPeriods: 1
      ComparisonOperator: GreaterThanThreshold
      Threshold: 0
      AlarmActions:
      - 'Fn::ImportValue': !Sub '${AlertingModule}-Arn'
      Dimensions:
      - Name: FunctionName
        Value: !Ref DrainInstanceLambda
Outputs:
  ModuleId:
    Value: 'ecs-cluster-ec2'
  ModuleVersion:
    Value: '1.0.0'
  StackName:
    Value: !Ref 'AWS::StackName'
  Arn:
    Value: !GetAtt 'Cluster.Arn'
    Export:
      Name: !Sub '${AWS::StackName}-Arn'
  Name:
    Value: !Ref Cluster
    Export:
      Name: !Sub '${AWS::StackName}-Name'
  SecurityGroupId:
    Description: 'The Security Group Id of ECS cluster instances.'
    Value: !Ref SecurityGroup
    Export:
      Name: !Sub '${AWS::StackName}-SecurityGroupId'
  LogGroup:
    Description: 'Log group of ECS cluster.'
    Value: !Ref LogGroup
    Export:
      Name: !Sub '${AWS::StackName}-LogGroup'
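
For anyone reading the template above, the `SchedulableContainers` metric that drives the scale-up and scale-down alarms comes down to a small calculation. The following is only a minimal sketch of that calculation, pulled out of the Lambda as a pure function so it can be run without AWS. The helper name `schedulableContainers` and the sample numbers are made up for illustration; they are not part of the template.

```javascript
'use strict';

// Hypothetical helper (not in the template): for each container instance, count how
// many containers of the largest expected size still fit into the remaining CPU and
// memory, then sum those counts across the cluster.
function schedulableContainers(instances, containerMaxCpu, containerMaxMemory) {
  return instances
    .map((remaining) => Math.min(
      Math.floor(remaining.cpu / containerMaxCpu),
      Math.floor(remaining.memory / containerMaxMemory)
    ))
    .reduce((acc, containers) => acc + containers, 0);
}

// Made-up sample data: remaining CPU units and memory (MiB) per container instance.
const sample = [
  {cpu: 1024, memory: 3000}, // fits 2 by CPU and 2 by memory
  {cpu: 512, memory: 500},   // fits 1 by CPU but 0 by memory
  {cpu: 2048, memory: 900}   // fits 4 by CPU but 0 by memory
];
const result = schedulableContainers(sample, 512, 1024);
console.log(result);
```

Because the per-instance count uses the minimum of the CPU and memory fit, an instance with plenty of one resource but too little of the other contributes nothing, which is why the alarms react to reservations and not to utilization.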
oanhnn commented 5 years ago

I would like to donate to fund this feature. What can I do?

andreaswittig commented 5 years ago

Thanks for raising this feature request. Would you please contact us at hello@widdix.net to discuss how to sponsor this feature?