jamie-chapman commented 1 year ago

Terraform Core Version

1.3.3

AWS Provider Version

4.66.1

Affected Resource(s)

aws_appautoscaling_policy

aws_cloudwatch_metric_alarm

Expected Behavior

Expected an aws_appautoscaling_policy to be applied and then a aws_cloudwatch_metric_alarm to be applied and attached to the autoscaling policy to monitor some MQ metrics and scale up/down based on the metrics.

Actual Behavior

When applying an autoscaling policy for an ECS service, the service itself deploys fine, and the autoscaling policy gets added as well, but the aws_cloudwatch_metric_alarm is meant to also be applied and attached to the autoscaling policy. Note that there are 2 policies and 2 cloudwatch metric alarms for each service to scale in and out.

There is no error at the end of the apply however the aws_cloudwatch_metric_alarm never gets attached to the autoscaling policy even with a depends_on to the aws_appautoscaling_policy resource and the arn of the autoscaling policy added in the alarm_actions input like so:

alarm_actions = [aws_appautoscaling_policy.scale_in.arn]

Then when running a 2nd apply there will be the aws_cloudwatch_metric_alarm in the plan and the aws_appautoscaling_policy wants to be replaced with a new arn and finally gets applied.

This is an issue as on first glance it looks like autoscaling has been added correctly when it hasn't. Perhaps it is due to the aws_appautoscaling_policy not supporting aws_cloudwatch_metric_alarms as the example in the docs for aws_cloudwatch_metric_alarm gives a aws_autoscaling_policy as the example, not aws_appautoscaling_policy.

Relevant Error/Panic Output Snippet

There was no error message just aws_cloudwatch_metric_alarm resources aws_cloudwatch_metric_alarm that silently didn't get attached to the autoscaling policy.

Terraform Configuration Files

We are using terragrunt.

Steps to Reproduce

Create an ECS service with 2 aws_appautoscaling_policy resources attached to scale the DesiredCount of tasks in the service up or down, with a aws_cloudwatch_metric_alarm attached to the alarm_actions of each autoscaling policy. Then try to apply these resources and then apply again. If you don't see the issue at first try updating the task definition of the service and then apply, it seems that the arns are not updated properly.

Debug Output

No response

Panic Output

No response

Important Factoids

No response

References

https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/appautoscaling_policy#alarm_arns

Would you like to implement a fix?

None

github-actions[bot] commented 1 year ago

Community Note

Voting for Prioritization

Please vote on this issue by adding a 👍 reaction to the original post to help the community and maintainers prioritize this request.
Please see our prioritization guide for information on how we prioritize.
Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request.

Volunteering to Work on This Issue

If you are interested in working on this issue, please leave a comment.
If this would be your first contribution, please review the contribution guide.

justinretzolk commented 1 year ago

Hey @jamie-chapman 👋 Can you supply a sample Terraform configuration and debug logs (redacted as needed) so that we have the necessary information in order to look into this?

jamie-chapman commented 1 year ago

Hey @justinretzolk I've just looked at some other issues and I see what is meant by configuration now, sorry here you go:

resource "aws_appautoscaling_policy" "scale_in" {
  policy_type        = "StepScaling"
  name               = "mq-scaling-in-example-service"
  resource_id        = "service/example-env/example-service"
  scalable_dimension = "ecs:service:DesiredCount"
  service_namespace  = "ecs"

  step_scaling_policy_configuration {
    adjustment_type         = "ChangeInCapacity"
    metric_aggregation_type = "Average"

    step_adjustment {
      metric_interval_upper_bound = 0
      scaling_adjustment          = -1
    }
    cooldown = 300
  }
}

resource "aws_cloudwatch_metric_alarm" "mq_scale_in" {
  for_each = toset(["queue1", "queue2"])

  alarm_name          = "StepScaling-MQQueueScaleIn-example-service-${each.key}"
  comparison_operator = "LessThanThreshold"
  evaluation_periods  = "3"
  threshold           = "20"
  alarm_description   = "AmazonMQ autoscale IN alarm example-service for queue ${each.key}"
  alarm_actions       = [aws_appautoscaling_policy.scale_in.arn]

  metric_query {
    id          = "messageready"
    label       = "Messages Ready in '${each.key}' queue"
    return_data = true
    metric {
      dimensions = {
        Broker      = "example-broker"
        Queue       = each.key
        VirtualHost = "/"
      }
      namespace   = "AWS/AmazonMQ"
      metric_name = "MessageReadyCount"
      period      = 60
      stat        = "Average"
    }
  }

  depends_on = [aws_appautoscaling_policy.scale_in]
}

As for debug logs I'll see if I can get those for you asap

jamie-chapman commented 1 year ago

Hi @justinretzolk unfortunately I can't get my hands on any logs besides the screenshots that we posted above. But reading around the current issues, this one is almost exactly the same behaviour that I'm seeing

https://github.com/hashicorp/terraform-provider-aws/issues/31261

jamie-chapman commented 1 year ago

I suspect that something similar is happening to our configuration. We did have this in place but I don't think that would have fixed the issue at all, and we aren't using aws_appautoscaling_target, we are using aws_appautoscaling_policy in conjunction with aws_cloudwatch_metric_alarm where perhaps this same issue is happening and hasn't been fixed by the update in the issue above.

lifecycle { #TODO: Remove this once THE BUG IS FIXED
    ignore_changes = [step_scaling_policy_configuration]
  }

jamie-chapman commented 11 months ago

Hi @justinretzolk is there any progress with this bug, I wonder if it has been seen by anyone else? We have removed these resources from out Terraform state for now as the deployment is still not applying the resources correctly as expected.

xpl-m-bocian commented 1 month ago

Hello everyone, Issue seems to still exist with the current AWS Provider version (5.61.0). :c @jamie-chapman , any news perhaps? Cheers!

hashicorp / terraform-provider-aws

[Bug]: `aws_cloudwatch_metric_alarm` doesn't apply on first try with `aws_appautoscaling_policy` arn in `alarm_actions` #31329

Terraform Core Version

AWS Provider Version

Affected Resource(s)

Expected Behavior

Actual Behavior

Relevant Error/Panic Output Snippet

Terraform Configuration Files

Steps to Reproduce

Debug Output

Panic Output

Important Factoids

References

Would you like to implement a fix?

Community Note