aws / copilot-cli

The AWS Copilot CLI is a tool for developers to build, release and operate production ready containerized applications on AWS App Runner or Amazon ECS on AWS Fargate.
https://aws.github.io/copilot-cli/
Apache License 2.0

deployment circuit breaker rollback #1879

Open bpottier opened 3 years ago

bpottier commented 3 years ago

I'm using the latest version, which added the deployment circuit breaker, but the behavior I'm seeing is as though the circuit breaker is not being used.

I ran "svc deploy" with an updated image that deliberately runs "CMD exit 1" in order to test the rollback. However, the stack remains in the "UPDATE_IN_PROGRESS" status and the CLI continues to cycle on "Deploying".

The tasks are being spun up and killed indefinitely, like they would before the circuit breaker was implemented. This has been running for over an hour. Am I missing something or is this a bug?

efekarakus commented 3 years ago

Hi @bpottier, do you know how many tasks have failed so far? The rollback kicks in once 10 tasks have failed (https://aws.amazon.com/blogs/containers/announcing-amazon-ecs-deployment-circuit-breaker/).

bpottier commented 3 years ago

> Hi @bpottier, do you know how many tasks have failed so far? The rollback kicks in once 10 tasks have failed (https://aws.amazon.com/blogs/containers/announcing-amazon-ecs-deployment-circuit-breaker/)

Looking more closely, "failedTasks" is at 0, so the tasks that fail to start aren't being marked as failed. Is this normal behavior? Does the platform version matter? Copilot created the service as 1.3.0.
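For anyone who wants to check the same counters, here is a minimal Python sketch that pulls the circuit-breaker-relevant fields out of an ECS DescribeServices-shaped response. The field names ("deployments", "failedTasks", "rolloutState", "runningCount", "desiredCount") come from that API; the function names, the "looks stuck" heuristic, and the sample data are hypothetical and only mirror the situation described here.

```python
from typing import Any, Dict, List


def summarize_deployments(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Collect per-deployment counters from a DescribeServices-shaped dict.

    A real response could come from, for example,
    boto3.client("ecs").describe_services(cluster=..., services=[...]).
    """
    summaries = []
    for service in response.get("services", []):
        for deployment in service.get("deployments", []):
            summaries.append({
                "service": service.get("serviceName"),
                "status": deployment.get("status"),
                "rollout_state": deployment.get("rolloutState"),
                "failed_tasks": deployment.get("failedTasks", 0),
                "running": deployment.get("runningCount", 0),
                "desired": deployment.get("desiredCount", 0),
            })
    return summaries


def looks_stuck(summary: Dict[str, Any]) -> bool:
    """Heuristic (an assumption, not ECS logic): the rollout is still in
    progress, no failures have been counted, and tasks are not running."""
    return (
        summary["rollout_state"] == "IN_PROGRESS"
        and summary["failed_tasks"] == 0
        and summary["running"] < summary["desired"]
    )


# Hand-written sample shaped like the case in this thread; not real output.
sample = {
    "services": [{
        "serviceName": "api",
        "deployments": [
            {"status": "PRIMARY", "rolloutState": "IN_PROGRESS",
             "failedTasks": 0, "runningCount": 0, "desiredCount": 1},
            {"status": "ACTIVE", "rolloutState": "COMPLETED",
             "failedTasks": 0, "runningCount": 1, "desiredCount": 1},
        ],
    }]
}

for row in summarize_deployments(sample):
    print(row["status"], row["rollout_state"], row["failed_tasks"], looks_stuck(row))
```

A deployment that stays IN_PROGRESS with "failedTasks" at 0 while tasks keep exiting matches what I'm seeing.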

bpottier commented 3 years ago

Cancelling the stack update finally rolled the service back to the previous task definition. Otherwise the rollback would never have kicked in, since no "failedTasks" were being registered. I guess it's not an issue with Copilot, but maybe with ECS?

efekarakus commented 3 years ago

Yeah, thanks for letting us know. You're right, this is an issue related to ECS, and we relayed this information to the team! Would you mind giving a 👍 to https://github.com/aws/containers-roadmap/issues/1206? It will help the product team with prioritization.

iamhopaul123 commented 3 years ago

> Does the platform version matter? Copilot created the service as 1.3.0.

Random fact: platform version shouldn't matter for circuit breaker.

bpottier commented 3 years ago

Ah thank you. I gave a 👍. Also, I did get the rollback to kick in by other means and it seems there's no indication from copilot that the deployment was rolled back. It'd be nice for it to return a message saying that it rolled back and possibly return a non-zero exit status since it may be helpful for other build tools to register the rollback.
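To illustrate the exit-status idea, here is a rough Python sketch of what a build script could do in the meantime. It is not how Copilot behaves or is implemented; the function name and the decision rule are assumptions. It relies only on the CloudFormation convention that rollback statuses contain "ROLLBACK" (for example "UPDATE_ROLLBACK_COMPLETE") and on the ECS deployment "rolloutState" value "FAILED".

```python
from typing import Optional


def deployment_exit_code(stack_status: str, rollout_state: Optional[str]) -> int:
    """Return 1 when the deployment appears to have been rolled back, else 0.

    stack_status:  final CloudFormation stack status, e.g. "UPDATE_COMPLETE".
    rollout_state: ECS rolloutState of the new deployment, if known.
    """
    if "ROLLBACK" in stack_status:
        return 1
    if rollout_state == "FAILED":
        return 1
    return 0


# A CI step could finish with: sys.exit(deployment_exit_code(status, state))
print(deployment_exit_code("UPDATE_COMPLETE", "COMPLETED"))
print(deployment_exit_code("UPDATE_ROLLBACK_COMPLETE", None))
```

The intent is only that a rolled-back deployment fails the pipeline step instead of passing silently.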

efekarakus commented 3 years ago

> Ah thank you. I gave a 👍. Also, I did get the rollback to kick in by other means and it seems there's no indication from copilot that the deployment was rolled back. It'd be nice for it to return a message saying that it rolled back and possibly return a non-zero exit status since it may be helpful for other build tools to register the rollback.

You'll be able to see this in v1.2.0 :)

bpottier commented 3 years ago

> > Ah thank you. I gave a 👍. Also, I did get the rollback to kick in by other means and it seems there's no indication from copilot that the deployment was rolled back. It'd be nice for it to return a message saying that it rolled back and possibly return a non-zero exit status since it may be helpful for other build tools to register the rollback.
>
> You'll be able to see this in v1.2.0 :)

You're killing it. Seriously, I've just started researching copilot to replace ecs-cli in our other projects and I'm impressed. Great job!