aws / aws-cdk

The AWS Cloud Development Kit is a framework for defining cloud infrastructure in code
https://aws.amazon.com/cdk
Apache License 2.0

aws-ecs: Updating capacity provider strategies on EC2 service causes replacement #29826

Open nrhtr opened 3 months ago

nrhtr commented 3 months ago

Describe the bug

When attempting to update a service which has no capacity provider strategy (i.e. defaulted to EC2 launch type) to have a capacity provider strategy, CDK reports that it will replace the service rather than update-in-place.

This is problematic for two reasons:

  1. It results in downtime, which should not be necessary given that ECS appears to be able to apply this change in place with a new deployment.
  2. (Perhaps less important, or specific to my use case) The service replacement doesn't work: CloudFormation fails with an error that the service "already exists".

Expected Behavior

I expect the capacity provider strategy would be updated in-place without requiring a service replacement.

Current Behavior

CDK/CloudFormation attempts to recreate the service (with the same name), and fails.

The output from cdk diff

[~] AWS::ECS::Service Service/Service ServiceD69D759B replace
 ├─ [+] CapacityProviderStrategy
 │   └─ [{"Base":0,"CapacityProvider":"bkt-stage-ecs-capcity-association-ExtraLargeCapacityProvider-hMg2vMaHZjvB","Weight":1}]
 ├─ [~] DeploymentConfiguration
 │   └─ [+] Added: .Alarms
 ├─ [-] LaunchType (requires replacement)
 │   └─ EC2
 └─ [~] DependsOn
     └─ @@ -1,3 +1,5 @@
        [ ] [
        [+]   "Ec2TaskTaskRole88E481A9",
        [+]   "Ec2TaskTaskRoleDefaultPolicy8BFDC019",
        [ ]   "Listener828B0E81"
        [ ] ]
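
In the diff above, only the removed `LaunchType` is marked as requiring replacement; the added `CapacityProviderStrategy` carries no such marker. The sketch below is a simplified model of how one such property can turn the whole resource change into a replacement. It is not CDK's or CloudFormation's actual implementation: `classifyUpdate` and `REPLACE_ON_CHANGE` are hypothetical names, the list contains only the property the diff flags, and `example-provider` is a placeholder.

```typescript
// Simplified model of how a resource-level "replace" verdict can arise.
// Hypothetical helper, not part of aws-cdk or CloudFormation.
type Props = Record<string, unknown>;
type Verdict = "none" | "update" | "replace";

// Assumption: only the property flagged in the diff above is listed here.
const REPLACE_ON_CHANGE: string[] = ["LaunchType"];

function classifyUpdate(before: Props, after: Props): Verdict {
  // Visit every property present on either side (duplicates are harmless).
  const keys = Object.keys(before).concat(Object.keys(after));
  let verdict: Verdict = "none";
  for (let i = 0; i < keys.length; i++) {
    const key = keys[i];
    const changed = JSON.stringify(before[key]) !== JSON.stringify(after[key]);
    if (!changed) continue;
    // A single replace-on-change property decides the outcome for the resource.
    if (REPLACE_ON_CHANGE.indexOf(key) !== -1) return "replace";
    verdict = "update";
  }
  return verdict;
}

// Same shape as the diff above: LaunchType disappears, a strategy is added.
const before: Props = { LaunchType: "EC2" };
const after: Props = {
  CapacityProviderStrategy: [{ Base: 0, CapacityProvider: "example-provider", Weight: 1 }],
};
console.log(classifyUpdate(before, after)); // prints "replace"
```

In this model, adding the strategy while `LaunchType` stays unchanged is classified as "update", so the replacement is driven by the dropped `LaunchType` rather than by the new strategy.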

Fails to deploy:

bkt-sandbox-event-service: deploying...
[0%] start: Publishing 575056c4a80a4675f729f94d81d3c941abe89eb2e79980b1f66510651c49b041:741749084370-ap-southeast-2
[100%] success: Published 575056c4a80a4675f729f94d81d3c941abe89eb2e79980b1f66510651c49b041:741749084370-ap-southeast-2
bkt-sandbox-event-service: creating CloudFormation changeset...
bkt-sandbox-event-service | 0/6 | 1:48:57 AM | UPDATE_IN_PROGRESS   | AWS::CloudFormation::Stack                  | bkt-sandbox-event-service User Initiated
bkt-sandbox-event-service | 0/6 | 1:49:01 AM | CREATE_IN_PROGRESS   | AWS::ApiGateway::Deployment                 | APIGatewayDeployment (APIGatewayDeployment1E68EF6D6fcf8ac110bb22c4e72adc6284a4a82c) 
bkt-sandbox-event-service | 0/6 | 1:49:04 AM | CREATE_IN_PROGRESS   | AWS::ApiGateway::Deployment                 | APIGatewayDeployment (APIGatewayDeployment1E68EF6D6fcf8ac110bb22c4e72adc6284a4a82c) Resource creation Initiated
bkt-sandbox-event-service | 1/6 | 1:49:04 AM | CREATE_COMPLETE      | AWS::ApiGateway::Deployment                 | APIGatewayDeployment (APIGatewayDeployment1E68EF6D6fcf8ac110bb22c4e72adc6284a4a82c) 
bkt-sandbox-event-service | 1/6 | 1:49:04 AM | UPDATE_IN_PROGRESS   | AWS::ECS::TaskDefinition                    | Ec2Task (Ec2TaskB165294F) Requested update requires the creation of a new physical resource; hence creating one.
bkt-sandbox-event-service | 1/6 | 1:49:06 AM | UPDATE_IN_PROGRESS   | AWS::ECS::TaskDefinition                    | Ec2Task (Ec2TaskB165294F) Resource creation Initiated
bkt-sandbox-event-service | 2/6 | 1:49:06 AM | UPDATE_COMPLETE      | AWS::ECS::TaskDefinition                    | Ec2Task (Ec2TaskB165294F) 
bkt-sandbox-event-service | 2/6 | 1:49:08 AM | UPDATE_IN_PROGRESS   | AWS::ECS::Service                           | Service/Service (ServiceD69D759B) Requested update requires the creation of a new physical resource; hence creating one.
bkt-sandbox-event-service | 2/6 | 1:49:09 AM | UPDATE_FAILED        | AWS::ECS::Service                           | Service/Service (ServiceD69D759B) Resource handler returned message: "Resource of type 'AWS::ECS::Service' with identifier 'event-service-sandbox' already exists." (RequestToken: 37bce0e0-69a9-8a5d-865f-f222d914da26, HandlerErrorCode: AlreadyExists)
bkt-sandbox-event-service | 2/6 | 1:49:10 AM | UPDATE_ROLLBACK_IN_P | AWS::CloudFormation::Stack                  | bkt-sandbox-event-service The following resource(s) failed to update: [ServiceD69D759B]. 
bkt-sandbox-event-service | 1/6 | 1:49:13 AM | UPDATE_COMPLETE      | AWS::ECS::TaskDefinition                    | Ec2Task (Ec2TaskB165294F) 
bkt-sandbox-event-service | 2/6 | 1:49:13 AM | UPDATE_COMPLETE      | AWS::ECS::Service                           | Service/Service (ServiceD69D759B) 
bkt-sandbox-event-service | 3/6 | 1:49:15 AM | UPDATE_ROLLBACK_COMP | AWS::CloudFormation::Stack                  | bkt-sandbox-event-service 
bkt-sandbox-event-service | 3/6 | 1:49:16 AM | DELETE_IN_PROGRESS   | AWS::ApiGateway::Deployment                 | APIGatewayDeployment (APIGatewayDeployment1E68EF6D6fcf8ac110bb22c4e72adc6284a4a82c) 
bkt-sandbox-event-service | 2/6 | 1:49:16 AM | DELETE_COMPLETE      | AWS::ECS::Service                           | Service/Service (ServiceD69D759B) 
bkt-sandbox-event-service | 2/6 | 1:49:17 AM | DELETE_IN_PROGRESS   | AWS::ECS::TaskDefinition                    | Ec2Task (Ec2TaskB165294F) 
bkt-sandbox-event-service | 1/6 | 1:49:17 AM | DELETE_COMPLETE      | AWS::ApiGateway::Deployment                 | APIGatewayDeployment (APIGatewayDeployment1E68EF6D6fcf8ac110bb22c4e72adc6284a4a82c) 
bkt-sandbox-event-service | 0/6 | 1:49:18 AM | DELETE_COMPLETE      | AWS::ECS::TaskDefinition                    | Ec2Task (Ec2TaskB165294F) 
bkt-sandbox-event-service | 1/6 | 1:49:18 AM | UPDATE_ROLLBACK_COMP | AWS::CloudFormation::Stack                  | bkt-sandbox-event-service 
Failed resources:
bkt-sandbox-event-service | 1:49:09 AM | UPDATE_FAILED        | AWS::ECS::Service                           | Service/Service (ServiceD69D759B) Resource handler returned message: "Resource of type 'AWS::ECS::Service' with identifier 'event-service-sandbox' already exists." (RequestToken: 37bce0e0-69a9-8a5d-865f-f222d914da26, HandlerErrorCode: AlreadyExists)
 ❌  bkt-sandbox-event-service failed: Error: The stack named bkt-sandbox-event-service failed to deploy: UPDATE_ROLLBACK_COMPLETE: Resource handler returned message: "Resource of type 'AWS::ECS::Service' with identifier 'event-service-sandbox' already exists." (RequestToken: 37bce0e0-69a9-8a5d-865f-f222d914da26, HandlerErrorCode: AlreadyExists)
    at FullCloudFormationDeployment.monitorDeployment (/opt/atlassian/pipelines/agent/build/node_modules/aws-cdk/lib/api/deploy-stack.ts:496:13)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
    at deployStack2 (/opt/atlassian/pipelines/agent/build/node_modules/aws-cdk/lib/cdk-toolkit.ts:241:24)
    at /opt/atlassian/pipelines/agent/build/node_modules/aws-cdk/lib/deploy.ts:39:11
    at run (/opt/atlassian/pipelines/agent/build/node_modules/p-queue/dist/index.js:163:29)
 ❌ Deployment failed: Error: Stack Deployments Failed: Error: The stack named bkt-sandbox-event-service failed to deploy: UPDATE_ROLLBACK_COMPLETE: Resource handler returned message: "Resource of type 'AWS::ECS::Service' with identifier 'event-service-sandbox' already exists." (RequestToken: 37bce0e0-69a9-8a5d-865f-f222d914da26, HandlerErrorCode: AlreadyExists)
    at deployStacks (/opt/atlassian/pipelines/agent/build/node_modules/aws-cdk/lib/deploy.ts:61:11)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
    at CdkToolkit.deploy (/opt/atlassian/pipelines/agent/build/node_modules/aws-cdk/lib/cdk-toolkit.ts:314:7)
    at initCommandLine (/opt/atlassian/pipelines/agent/build/node_modules/aws-cdk/lib/cli.ts:357:12)
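
The log above shows CloudFormation creating the new physical resource while the old service still exists; the create is rejected because the identifier `event-service-sandbox` is already taken. Below is a minimal sketch of that ordering as a toy in-memory model. `FakeEcs` and `replaceService` are hypothetical names for illustration, not real AWS or CDK APIs.

```typescript
// Toy model of create-before-delete replacement with a fixed service name.
// Hypothetical classes and functions; nothing here calls AWS.
class FakeEcs {
  private names: string[] = [];

  create(name: string): void {
    if (this.names.indexOf(name) !== -1) {
      throw new Error("Resource with identifier '" + name + "' already exists.");
    }
    this.names.push(name);
  }

  delete(name: string): void {
    this.names = this.names.filter((n) => n !== name);
  }

  has(name: string): boolean {
    return this.names.indexOf(name) !== -1;
  }
}

function replaceService(ecs: FakeEcs, oldName: string, newName: string): string {
  try {
    ecs.create(newName); // the replacement is created first
  } catch (err) {
    return "UPDATE_FAILED: " + (err as Error).message;
  }
  ecs.delete(oldName); // the old resource is removed only afterwards
  return "UPDATE_COMPLETE";
}

const fake = new FakeEcs();
fake.create("event-service-sandbox");
console.log(replaceService(fake, "event-service-sandbox", "event-service-sandbox"));
// prints "UPDATE_FAILED: Resource with identifier 'event-service-sandbox' already exists."
```

In this model the same call succeeds when the replacement uses a different name, which is why the conflict only appears for a service with a fixed name.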

Reproduction Steps

  1. Create an ECS service with no capacity provider strategy.
  2. Attempt to update the service to specify a capacity provider.

Possible Solution

No response

Additional Information/Context

I'm currently using this as a workaround:

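import * as ecs from 'aws-cdk-lib/aws-ecs'

// Workaround to avoid recreating the ECS service when updating capacity provider strategies.
// Note: Even if we were OK with the downtime for this, replacement doesn't work because
// CloudFormation first tries to create a new service, which conflicts with the existing
// (not yet deleted) one.
// `service` is the existing L2 service construct; `capacityProvider` is the provider name (a string).
const cfnService = service.node.defaultChild as ecs.CfnService
// Set the strategy directly on the underlying L1 resource (escape hatch).
cfnService.capacityProviderStrategy = [
  {
    capacityProvider,
    weight: 1,
  },
]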

CDK CLI Version

2.44.0 (build bf32cb1)

Framework Version

No response

Node.js Version

v20.11.1

OS

macOS Sonoma 14.2.1

Language

TypeScript

Language Version

No response

Other information

No response

khushail commented 3 months ago

Hi @nrhtr, thanks for reaching out. I see that you are using a very old version of CDK. Could you please try running it on CDK 2.137 and see if the issue persists?

nrhtr commented 3 months ago

Hi @khushail, sorry, I forgot to mention that I also tested with 2.137.0 (build bb90b4c) with the same result.