aws / aws-toolkit-visual-studio

AWS Toolkit for Visual Studio - a plugin to interact with AWS
https://aws.amazon.com/visualstudio/
Apache License 2.0

Deploy to ECS using Fargate always stuck on creating ECS Service #293

Closed achmadmulyadi closed 1 year ago

achmadmulyadi commented 2 years ago

Hi, I updated the toolkit to 1.36.00 and now get stuck every time I deploy my project to ECS. I'm not sure whether this issue is related to the AWS Toolkit or to the latest change in AWS processor quotas. All I remember is that everything seemed fine before those two updates.

Steps: Publish -> New Target -> Choose Fargate -> Edit Settings to use an existing VPC (everything else remains at the default) -> Click Publish.

Log aws log.txt

Screenshots image

Computer (please complete the following information):

Windows Version: Windows 11
Visual Studio Version: VS 2022
AWS Toolkit for Visual Studio Version: 1.36.00

awschristou commented 2 years ago

It looks like the CloudFormation deployment waited three hours for the Fargate service to create and enter a stable state. The key deployment failure from the log appears to be:

Recipe/AppFargateService/Service (RecipeAppFargateService71FD6243) Service arn:aws:ecs:ap-southeast-3:175195474598:service/DevelopmentRediAutomotiveBlazorServer/DevelopmentRediAutomotiveBlazorServer-service did not stabilize.

Is there a chance that your service is not starting up successfully, causing Fargate to continuously re-launch it? https://aws.amazon.com/premiumsupport/knowledge-center/cloudformation-ecs-service-stabilize/ has a potential (temporary) workaround, but it is worth finding out whether there is an underlying root cause to fix.

Pinging @philasmar in case he has additional insights.

achmadmulyadi commented 2 years ago

It's kind of weird: my service is actually already in a steady state, but CloudFormation is unable to read that status. I can even access my application from the browser, but CloudFormation then rolls back the deployment after the timeout occurs.

image

Is it normal that the task resource has completed but the service is still CREATE_IN_PROGRESS?

image
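Editorial sketch: one way to reason about the question above is to compare the service description that ECS returns with a stability condition. The function below is a minimal illustration, assuming the condition matches what the boto3 `services_stable` waiter appears to check (exactly one deployment, and `runningCount` equal to `desiredCount`). Whether CloudFormation applies the same rule is an assumption, not something this thread confirms. The sample dictionary is hypothetical and only uses field names found in ECS service descriptions.

```python
from typing import Any, Dict


def is_service_stable(service: Dict[str, Any]) -> bool:
    """Return True when a service description looks stable.

    Assumed condition (not confirmed for CloudFormation): exactly one
    deployment, and runningCount equal to desiredCount.
    """
    deployments = service.get("deployments", [])
    return (
        len(deployments) == 1
        and service.get("runningCount") == service.get("desiredCount")
    )


# Hypothetical description of a service whose first task has not started yet.
just_created = {
    "desiredCount": 1,
    "runningCount": 0,
    "deployments": [{"status": "PRIMARY", "rolloutState": "IN_PROGRESS"}],
}

print(is_service_stable(just_created))  # prints False
```

If a check like this returns True for the live service while the stack still shows the service as in progress, that would suggest the delay is on the reporting side rather than in the container itself.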

philasmar commented 2 years ago

Since you tested this with an existing VPC, could you try to create a new deployment with a new VPC? I'm curious if the existing VPC that you are using has anything to do with it.

achmadmulyadi commented 2 years ago

Hi @philasmar

I've tried deploying to a new VPC, but no luck; I get the same error. Is there any other workaround I can try?

achmadmulyadi commented 2 years ago

Here's the log when creating the service:

{ "eventVersion": "1.08", "userIdentity": { "type": "AssumedRole", "principalId": "AROASRSTWZKTFBCGLXSOM:AWSCloudFormation", "arn": "arn:aws:sts::175195474598:assumed-role/cdk-hnb659fds-cfn-exec-role-175195474598-ap-southeast-3/AWSCloudFormation", "accountId": "175195474598", "accessKeyId": "ASIASRSTWZKTOBXO5LWH", "sessionContext": { "sessionIssuer": { "type": "Role", "principalId": "AROASRSTWZKTFBCGLXSOM", "arn": "arn:aws:iam::175195474598:role/cdk-hnb659fds-cfn-exec-role-175195474598-ap-southeast-3", "accountId": "175195474598", "userName": "cdk-hnb659fds-cfn-exec-role-175195474598-ap-southeast-3" }, "webIdFederationData": {}, "attributes": { "creationDate": "2022-10-29T01:18:31Z", "mfaAuthenticated": "false" } }, "invokedBy": "cloudformation.amazonaws.com" }, "eventTime": "2022-10-29T01:20:14Z", "eventSource": "ecs.amazonaws.com", "eventName": "CreateService", "awsRegion": "ap-southeast-3", "sourceIPAddress": "cloudformation.amazonaws.com", "userAgent": "cloudformation.amazonaws.com", "requestParameters": { "clientToken": "diAutomotiveBlazorServer-service", "cluster": "DevRediAutomotiveBlazorServer", "deploymentConfiguration": { "maximumPercent": 200, "minimumHealthyPercent": 50 }, "desiredCount": 1, "enableECSManagedTags": false, "enableExecuteCommand": false, "healthCheckGracePeriodSeconds": 60, "launchType": "FARGATE", "loadBalancers": [ { "targetGroupArn": "arn:aws:elasticloadbalancing:ap-southeast-3:175195474598:targetgroup/DevRe-Recip-1BVMPHP8YOBVZ/169c77a97c336cae", "containerName": "AppContainerDefinition", "containerPort": 80 } ], "networkConfiguration": { "awsvpcConfiguration": { "assignPublicIp": "ENABLED", "securityGroups": [ "sg-0532b8d03c0657717" ], "subnets": [ "subnet-08413dc9b864b96c8", "subnet-0886ce387be8ad3b6", "subnet-01c38fe7a30c45cc8" ] } }, "placementConstraints": [], "placementStrategy": [], "serviceName": "DevRediAutomotiveBlazorServer-service", "serviceRegistries": [], "tags": [ { "key": "aws-dotnet-deploy", "value": 
"AspNetAppEcsFargate" } ], "createdBy": "arn:aws:iam::175195474598:role/cdk-hnb659fds-cfn-exec-role-175195474598-ap-southeast-3", "taskDefinition": "arn:aws:ecs:ap-southeast-3:175195474598:task-definition/DevRediAutomotiveBlazorServerRecipeAppTaskDefinitionAD244FDC:38" }, "responseElements": { "service": { "serviceArn": "arn:aws:ecs:ap-southeast-3:175195474598:service/DevRediAutomotiveBlazorServer/DevRediAutomotiveBlazorServer-service", "serviceName": "DevRediAutomotiveBlazorServer-service", "clusterArn": "arn:aws:ecs:ap-southeast-3:175195474598:cluster/DevRediAutomotiveBlazorServer", "loadBalancers": [ { "targetGroupArn": "arn:aws:elasticloadbalancing:ap-southeast-3:175195474598:targetgroup/DevRe-Recip-1BVMPHP8YOBVZ/169c77a97c336cae", "containerName": "AppContainerDefinition", "containerPort": 80 } ], "serviceRegistries": [], "status": "ACTIVE", "desiredCount": 1, "runningCount": 0, "pendingCount": 0, "launchType": "FARGATE", "platformVersion": "LATEST", "platformFamily": "Linux", "taskDefinition": "arn:aws:ecs:ap-southeast-3:175195474598:task-definition/DevRediAutomotiveBlazorServerRecipeAppTaskDefinitionAD244FDC:38", "deploymentConfiguration": { "deploymentCircuitBreaker": { "enable": false, "rollback": false }, "maximumPercent": 200, "minimumHealthyPercent": 50 }, "deployments": [ { "id": "ecs-svc/4522459054729015200", "status": "PRIMARY", "taskDefinition": "arn:aws:ecs:ap-southeast-3:175195474598:task-definition/DevRediAutomotiveBlazorServerRecipeAppTaskDefinitionAD244FDC:38", "desiredCount": 1, "pendingCount": 0, "runningCount": 0, "failedTasks": 0, "createdAt": "Oct 29, 2022 1:20:14 AM", "updatedAt": "Oct 29, 2022 1:20:14 AM", "launchType": "FARGATE", "platformVersion": "1.4.0", "platformFamily": "Linux", "networkConfiguration": { "awsvpcConfiguration": { "assignPublicIp": "ENABLED", "securityGroups": [ "sg-0532b8d03c0657717" ], "subnets": [ "subnet-0886ce387be8ad3b6", "subnet-08413dc9b864b96c8", "subnet-01c38fe7a30c45cc8" ] } }, "rolloutState": 
"IN_PROGRESS", "rolloutStateReason": "ECS deployment ecs-svc/4522459054729015200 in progress.", "failedLaunchTaskCount": 0, "replacedTaskCount": 0 } ], "roleArn": "arn:aws:iam::175195474598:role/aws-service-role/ecs.amazonaws.com/AWSServiceRoleForECS", "version": 0, "events": [], "createdAt": "Oct 29, 2022 1:20:14 AM", "placementConstraints": [], "placementStrategy": [], "networkConfiguration": { "awsvpcConfiguration": { "assignPublicIp": "ENABLED", "securityGroups": [ "sg-0532b8d03c0657717" ], "subnets": [ "subnet-0886ce387be8ad3b6", "subnet-08413dc9b864b96c8", "subnet-01c38fe7a30c45cc8" ] } }, "healthCheckGracePeriodSeconds": 60, "schedulingStrategy": "REPLICA", "deploymentController": { "type": "ECS" }, "tags": [ { "key": "aws-dotnet-deploy", "value": "AspNetAppEcsFargate" } ], "createdBy": "arn:aws:iam::175195474598:role/cdk-hnb659fds-cfn-exec-role-175195474598-ap-southeast-3", "enableECSManagedTags": false, "propagateTags": "NONE", "enableExecuteCommand": false } }, "requestID": "e3970bee-d5fe-4314-9314-b16fddff6ba7", "eventID": "081f1836-4fee-4413-ab51-fc02fb63ff9f", "readOnly": false, "eventType": "AwsApiCall", "managementEvent": true, "recipientAccountId": "175195474598", "eventCategory": "Management" }

achmadmulyadi commented 2 years ago

Hi @awschristou and @philasmar

I can confirm this behavior occurs only in the Jakarta region: I tested deploying my project to the Singapore region and everything works as expected.

awschristou commented 2 years ago

@achmadmulyadi thank you for the insight. @philasmar , is there a chance that the deploy tool is using an older version of one or more AWS SDK packages, for example, a version that predates when Fargate was added to Jakarta?

philasmar commented 2 years ago

There are multiple AWS SDK references that could use an update. I will put out a PR soon to go over the package references and update them.

philasmar commented 2 years ago

@achmadmulyadi I have updated the AWS SDK references to the latest versions. Could you please try the latest version of the CLI https://github.com/aws/aws-dotnet-deploy/releases/tag/1.7.3 and check if the issue is still there?

achmadmulyadi commented 2 years ago

@philasmar I have updated my CLI to 1.7.3, as you can see from the screenshot, but the issue is still there, with either the default VPC or a new VPC.

image

philasmar commented 2 years ago

Are you able to share the logs of the ECS container? That will help us figure out what's going on.

achmadmulyadi commented 2 years ago

My container doesn't have any instances. Is there any way I can get the logs from it? My CloudTrail is active, but I'm not sure it has the detailed info you need.
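Editorial sketch: when no task is currently running, the stop reasons that ECS records for recently stopped tasks can still give a hint. The helper below is a minimal illustration, not the deploy tool's own method. It assumes an ECS client object that exposes `list_tasks` and `describe_tasks` as boto3's ECS client does. The function name `stopped_task_reasons` is hypothetical. ECS keeps stopped tasks visible only for a limited time, so the result may be empty.

```python
from typing import Any, Dict, List


def stopped_task_reasons(ecs_client: Any, cluster: str, service: str) -> List[Dict[str, Any]]:
    """Collect stop reasons for recently stopped tasks of one service."""
    listed = ecs_client.list_tasks(
        cluster=cluster, serviceName=service, desiredStatus="STOPPED"
    )
    task_arns = listed.get("taskArns", [])
    if not task_arns:
        # Nothing to describe; avoid calling describe_tasks with an empty list.
        return []

    described = ecs_client.describe_tasks(cluster=cluster, tasks=task_arns)
    reasons = []
    for task in described.get("tasks", []):
        reasons.append({
            "taskArn": task.get("taskArn"),
            "stoppedReason": task.get("stoppedReason"),
            "containers": [
                {
                    "name": c.get("name"),
                    "exitCode": c.get("exitCode"),
                    "reason": c.get("reason"),
                }
                for c in task.get("containers", [])
            ],
        })
    return reasons
```

With boto3 installed and credentials configured, this could be called as `stopped_task_reasons(boto3.client("ecs", region_name="ap-southeast-3"), "DevRediAutomotiveBlazorServer", "DevRediAutomotiveBlazorServer-service")`, using the cluster and service names from the CloudTrail event earlier in this thread. If the task definition uses the `awslogs` log driver, the container's own output would be in CloudWatch Logs rather than in these fields.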

philasmar commented 2 years ago

I'd like to set up a virtual debugging session to better understand your issue. Is that something you'd be open to?

achmadmulyadi commented 2 years ago

Sure, I'd love to. Just let me know how we'll do that; I can adjust the timing to suit your location. I'm at GMT+7, by the way.

philasmar commented 2 years ago

I have sent you an email with more instructions on how to schedule a session.

achmadmulyadi commented 1 year ago

Sorry for the late reply, @philasmar, but everything now seems to work as expected without my having changed anything. I suppose there were some internal updates in the Jakarta region. I think we can close the issue, as I have tried it several times without any problem.

philasmar commented 1 year ago

That's great to hear! I'll close this issue.