Closed booi closed 2 months ago
this sounds like a bug for the upstream AWS provider
> this sounds like a bug for the upstream AWS provider
Hi @bryantbiggs, could you explain how it might be an upstream AWS provider issue? The solution I mentioned does work in the AWS provider, it just needs to be configured in this module.
because of this statement (if it's true):

> But if you deploy a specific tag and then rerun terraform, terraform picks up that the `image_identifier` has changed and wants to reset it back to `latest`

Using the ignore-changes lifecycle is a very heavy-handed solution, and I suspect nearly everyone using this module would not want that behavior.
> Using the ignore-changes lifecycle is a very heavy-handed solution, and I suspect nearly everyone using this module would not want that behavior.
Most people do not use terraform for deployment so I'm not sure how that can be true. Is there an alternative solution?
How do you know that?
Well, professionally I've worked for a number of companies, from small to large (3000+), that all use terraform to provision and configure cloud resources but generally hand off deployment to a CI/CD system like GitHub Actions, Jenkins, etc.
Terraform is inherently more risky than a typical deployment as it affects cloud infrastructure, and it certainly isn't the recommended way to do deployments with ECS or k8s. It would also require you to commit the new image tag to git from a terraform action, which isn't that common and doesn't play well in larger multi-user environments.
Have you seen different?
Disclaimer: I work at AWS as a container specialist with our container service teams (EKS, ECS, Lambda, etc.)
> Have you seen different?
I have seen quite a bit 😬
> Terraform is inherently more risky than a typical deployment as it affects cloud infrastructure, and it certainly isn't the recommended way to do deployments with ECS or k8s. It would also require you to commit the new image tag to git from a terraform action, which isn't that common and doesn't play well in larger multi-user environments.
These are all very subjective statements. However, there are various parts where static configurations meet dynamically changing values (artifact versions and autoscaling values are the two most common culprits). You can see some of these pains here and here
In short, to support these we'd have to do a hacky thing by supporting dual resources: one with a lifecycle ignore block, and another without. It's a pattern that I abhor and am reluctant to support if at all possible. Instead, I would encourage users to explore alternative avenues, such as using an ECR data source and pulling the image value from there as the input to your AppRunner module configuration; sometimes SSM Parameter Store works just as well.
> I would encourage users to explore alternative avenues, such as using an ECR data source and pulling the image value from there as the input to your AppRunner module configuration
OHH I’m not sure why I didn’t think of that. That’s definitely a better approach. Let me see if that’s doable and if there’s a “first-run” issue. Thanks!
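A rough sketch of that data-source approach, assuming a recent AWS provider (the `image_uri` attribute is not available on older versions); the repository name, service name, and IAM role here are placeholders:

```hcl
# Look up the most recently pushed image in ECR (hypothetical repo name).
data "aws_ecr_image" "app" {
  repository_name = "my-app"
  most_recent     = true # or pin a tag with image_tag = "latest"
}

# Feed the looked-up image into App Runner. This targets the raw resource;
# the equivalent module input depends on the module version.
resource "aws_apprunner_service" "this" {
  service_name = "my-app"

  source_configuration {
    image_repository {
      image_repository_type = "ECR"
      image_identifier      = data.aws_ecr_image.app.image_uri
    }
    authentication_configuration {
      access_role_arn = aws_iam_role.apprunner_ecr_access.arn # assumed to exist
    }
  }
}
```

On older providers without `image_uri`, the identifier can be assembled from the repository URL and the `image_digest` attribute instead.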
@bryantbiggs I did a bunch of digging into this as it did come up again, and I'm kind of at a loss. I tried to use an `aws_ecr_image` data query to insert the correct image into the image configuration, which seems to work "ok"-ish, with two fairly major problems. The data source can look up either (1) the `latest` tag or (2) the most recent image, and both have their shortcomings.

(1) AppRunner only actually deploys if the deployed image tag is different from the one currently running, so the actual AppRunner service will not be using the `latest` tag. Even if we also tag the current image with `latest`, terraform will force AppRunner to redeploy to the `latest` tag even if it's the same image, causing terraform to run for at least 2-5 minutes waiting for the service to redeploy.

(2) The most recent image is not necessarily the "current" image. CI/CD will build and push a container even if it doesn't intend on running it (e.g. a dev or test container).

I'm not sure how this module fits into a production environment with these shortcomings. Let me know if you think there's a different workaround. I do feel like an optional `lifecycle` policy on the image configuration would solve all these problems.
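For what it's worth, the SSM Parameter Store route mentioned above would sidestep both problems, since the value only changes when the pipeline actually deploys something. A hypothetical sketch (the parameter name, service name, and IAM role are made up; CI/CD would write the full image URI to the parameter as part of each deploy):

```hcl
# Read the image URI that the deployment pipeline last wrote (hypothetical name).
data "aws_ssm_parameter" "deployed_image" {
  name = "/my-app/deployed-image"
}

resource "aws_apprunner_service" "this" {
  service_name = "my-app"

  source_configuration {
    image_repository {
      image_repository_type = "ECR"
      image_identifier      = data.aws_ssm_parameter.deployed_image.value
    }
    authentication_configuration {
      access_role_arn = aws_iam_role.apprunner_ecr_access.arn # assumed to exist
    }
  }
}
```

Because the parameter holds exactly the image that was deployed, terraform sees no drift on reruns and doesn't force a spurious redeploy.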
> (2) The most recent image is not necessarily the "current" image. CI/CD will build and push a container even if it doesn't intend on running it (e.g. a dev or test container).
We segregate our ECR infrastructure to facilitate this: Dev / QA / UAT images are pushed to a different ECR, and only approved releases that are merged to main are built and pushed to the "Live / Production" ECR.
We also employ a fix-forward strategy, so we generally don't roll back to previous container versions; most of the time those issues are caught by staging before making it to live.
We don't use the `latest` tag, but instead the short hash from the repo, which also corresponds to the "app version". In our case `most_recent` works nicely.
Deployments are handled by Buildkite.
This issue has been automatically marked as stale because it has been open 30 days with no activity. Remove stale label or comment or this issue will be closed in 10 days
This issue was automatically closed because it remained stale for 10 days
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.
Is your request related to a new offering from AWS?
No
Is your request related to a problem? Please describe.
Yes. When using AppRunner in image mode you supply the `image_identifier`, which includes the tag. Typically docs will just write `someorg/someproject:latest`. However, in production you usually deploy a specific tag (in our case a semver or git sha), e.g. `someorg/someproject:v1.5`. This is to prevent accidentally deploying the wrong version due to a race condition in whoever sets the `latest` tag last. But if you deploy a specific tag and then rerun terraform, terraform picks up that the `image_identifier` has changed and wants to reset it back to `latest`.

Describe the solution you'd like.
An option that would effectively add a `lifecycle` `ignore_changes` for the `image_identifier` field. That way terraform can do the initial provision, but subsequent deployments can change the `image_identifier` without bringing terraform out of sync.

Describe alternatives you've considered.
We switched our deploys to tag `latest` right before deployment. This doesn't eliminate the issue, but it reduces the window in which it could be an issue.

Additional context
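For reference, the requested option corresponds roughly to this `lifecycle` block on the underlying resource (a sketch only; the account ID, region, and names are placeholders). Since `lifecycle` arguments must be literal values rather than variables, the module would need to duplicate the resource to make this behavior optional:

```hcl
resource "aws_apprunner_service" "this" {
  service_name = "my-app"

  source_configuration {
    image_repository {
      image_repository_type = "ECR"
      image_identifier      = "123456789012.dkr.ecr.us-east-1.amazonaws.com/someorg/someproject:latest"
    }
    authentication_configuration {
      access_role_arn = aws_iam_role.apprunner_ecr_access.arn # assumed to exist
    }
  }

  lifecycle {
    # Terraform provisions the service once, then ignores tag changes
    # made by the deployment pipeline.
    ignore_changes = [source_configuration[0].image_repository[0].image_identifier]
  }
}
```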