Closed Leonidimus closed 4 years ago
This seems to be because Terraform doesn't add AmazonECSManaged as a propagated tag to the ASG itself when it links the capacity provider. As a workaround, add the following to your ASG configuration:
resource "aws_autoscaling_group" "service_asg" {
  ...
  tag {
    key                 = "AmazonECSManaged"
    value               = "" # the tag block requires a value; the empty string is an assumption
    propagate_at_launch = true
  }
  ...
}
Making this change worked for me.
@peter-boekelheide-ah did scale-in work for you? I tried that workaround a couple of weeks ago, and although the AmazonECSManaged tag was assigned to EC2 instances, the ASG was stuck at "Desired=2" with 3 instances actually running, all 3 still having the scale-in protection flag enabled. Maybe there is something different in the magical process when the AmazonECSManaged tag is created by the Capacity Provider. We can only guess at the internal logic.
@Leonidimus I was able to get scale in and out working. Mind you my ASG was set to a starting/minimum size of 0, so my use case may have been different to yours. But currently my ASG properly scales with the demands of my cap provider.
One thing that I had to do (not sure if it mattered or if it was just part of my magic chicken dance) was use the actual resource reference to my cap_provider.name as my ecs_cluster definition's capacity_provider. When I first encountered the cycle cluster->cap_provider->asg->launch_template->cluster, I used the literal string name of the cap provider in my ecs cluster resource, and this led to issues with the ASG's target tracking policy not being set up, among other things. I changed my launch template to instead use the ecs_cluster's string name in its user data to avoid the cycle, and then referred to the cap provider directly in my ecs cluster resource. That seems to have fixed the issue.
Not sure if that helps. YMMV. But I was able to finally get it working after some fiddling and gnashing of teeth.
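The cycle-breaking arrangement described in the comment above could be sketched roughly as follows. This is not the commenter's actual configuration: every resource name, the `cluster_name` local, the two variables, the sizes, and the empty tag value are assumptions made for illustration. It is written against the provider versions current in this thread, where `aws_ecs_cluster` accepts `capacity_providers` directly; later provider versions moved that setting to a separate `aws_ecs_cluster_capacity_providers` resource.

```hcl
# Hypothetical sketch: names, sizes, and variables are illustrative only.
variable "ecs_ami_id" { type = string }       # assumed: an ECS-optimized AMI
variable "subnet_ids" { type = list(string) } # assumed: subnets for the ASG

locals {
  # Plain string shared by the cluster and the user data, so the launch
  # template never references aws_ecs_cluster and no cycle is formed.
  cluster_name = "example-cluster"
}

resource "aws_launch_template" "example" {
  name_prefix   = "example-"
  image_id      = var.ecs_ami_id
  instance_type = "t3.micro"

  # Uses the string name, not aws_ecs_cluster.example.name.
  user_data = base64encode(<<-EOT
    #!/bin/bash
    echo "ECS_CLUSTER=${local.cluster_name}" >> /etc/ecs/ecs.config
  EOT
  )
  # IAM instance profile and security groups omitted for brevity.
}

resource "aws_autoscaling_group" "example" {
  name_prefix           = "example-"
  min_size              = 0
  max_size              = 3
  vpc_zone_identifier   = var.subnet_ids
  protect_from_scale_in = true # needed when managed termination protection is enabled

  launch_template {
    id      = aws_launch_template.example.id
    version = "$Latest"
  }

  tag {
    key                 = "AmazonECSManaged"
    value               = "" # assumption: empty value
    propagate_at_launch = true
  }
}

resource "aws_ecs_capacity_provider" "example" {
  name = "example"

  auto_scaling_group_provider {
    auto_scaling_group_arn         = aws_autoscaling_group.example.arn
    managed_termination_protection = "ENABLED"

    managed_scaling {
      status          = "ENABLED"
      target_capacity = 100
    }
  }
}

resource "aws_ecs_cluster" "example" {
  name = local.cluster_name

  # Resource reference rather than a literal string, so Terraform creates
  # the capacity provider before the cluster.
  capacity_providers = [aws_ecs_capacity_provider.example.name]
}
```

The dependency chain here runs cluster -> capacity provider -> ASG -> launch template and stops there, because the launch template depends only on the local value.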
Hi @Leonidimus and other folks 👋 Thanks for raising this.
In general, Terraform and the Terraform AWS Provider do not make any presumptions about infrastructure provisioning beyond what is directly configured. Any inherent behaviors or configuration created by layering resources on top of others must usually be accounted for in the Terraform configuration. In this case, since the ECS API automatically adds the AmazonECSManaged tag to the Auto Scaling Group when associated, there are two options. The Auto Scaling Group configuration can include that tag's configuration, so it's available immediately to any initial EC2 Instances and Terraform does not try to remove it later on. Alternatively, there may be workarounds such as ignore_changes to prevent Terraform from showing the tag removal as a difference. The latter can potentially cause issues similar to the original report here though, so the small documentation note mentioning ignore_changes with the AmazonECSManaged tag will be replaced with the configuration inclusion recommendation for clarity.
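For reference, the ignore_changes workaround mentioned above might look roughly like this. It is a sketch under assumptions: the resource name, variables, and sizes are hypothetical, and it ignores changes to every `tag` block on the group, not only AmazonECSManaged. As noted above, instances launched when the group is first created can still end up without the tag under this approach.

```hcl
# Hypothetical sketch of the less-preferred workaround; names are illustrative.
variable "subnet_ids" { type = list(string) }
variable "launch_template_id" { type = string }

resource "aws_autoscaling_group" "example" {
  name_prefix         = "example-"
  min_size            = 0
  max_size            = 3
  vpc_zone_identifier = var.subnet_ids

  launch_template {
    id      = var.launch_template_id
    version = "$Latest"
  }

  lifecycle {
    # Stops Terraform from planning removal of tags added outside of
    # Terraform (such as the one ECS adds). It also hides any other
    # tag drift on this group.
    ignore_changes = [tag]
  }
}
```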
The general preference in this case should be pre-configuring the AmazonECSManaged tag within the aws_autoscaling_group resource, so it's propagated automatically to initial EC2 Instances when min size is greater than 0 on creation (as mentioned above), e.g.
resource "aws_autoscaling_group" "example" {
  # ... other configuration, potentially including other tags ...
  tag {
    key                 = "AmazonECSManaged"
    value               = "" # the tag block requires a value; the empty string is an assumption
    propagate_at_launch = true
  }
}
Any EC2 Instances in the Auto Scaling Group that do not have the tag can, as mentioned above, show unexpected behavior with respect to scaling. The original issue should be resolvable with a configuration update, but we would like to add extra documentation on this matter to the aws_ecs_capacity_provider resource documentation, so I'm going to leave this issue open until those documentation changes are merged.
This has been released in version 3.0.0 of the Terraform AWS provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.
For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template for triage. Thanks!
We started using AWS Capacity Providers and now see the following issue: the ASG gets created and linked with the Capacity Provider just fine, but it never scales down. Amazon support spotted that some EC2 instances are not tagged with the AmazonECSManaged tag, which is required for instances to properly register with a Capacity Provider. All the untagged instances are the ones launched at ASG creation time; subsequently launched instances are tagged properly.
I think it could happen due to the ASG being created and populated with EC2 instances first, and then linked with a Capacity Provider, which would leave the already launched instances untagged. The proper sequence would be to create the ASG with min_size=0, link it with the Capacity Provider, then set min_size=N.
The problem is that the ASG never scales down, and the CapacityProviderReservation CloudWatch metric is also calculated incorrectly.
Community Note
Terraform Version
0.12.19
Affected Resource(s)
Terraform Configuration Files
Expected Behavior
ASG scales up and down with Capacity Provider linked
Actual Behavior
ASG scales up but never down because AmazonECSManaged is not assigned to EC2 instances launched when ASG was created.
Steps to Reproduce
Create ASG, ECS service and Capacity Provider with Terraform configuration snippets above
Important Factoids
Terminating untagged EC2 instances manually fixes the issue - ASG starts to scale down. However, it's not a feasible workaround due to a high number of deployments.