aws / aws-cdk

The AWS Cloud Development Kit is a framework for defining cloud infrastructure in code
https://aws.amazon.com/cdk
Apache License 2.0

(aws-ecs): autoScaleTaskCount and scaleOnCpuUtilization don't add scaling policies to ECS service #14297

Open mimozell opened 3 years ago

mimozell commented 3 years ago

I'm trying to set up target scaling for an ECS service following the documentation, but it doesn't seem to be doing anything.

Reproduction Steps

        service.autoScaleTaskCount(
            EnableScalingProps.builder()
                .minCapacity(app.minInstanceCount)
                .maxCapacity(app.maxInstanceCount)
                .build()
        ).scaleOnCpuUtilization(
            "cpuScaling",
            CpuUtilizationScalingProps.builder()
                .scaleInCooldown(app.scaleInCooldown)
                .scaleOutCooldown(app.scaleOutCooldown)
                .targetUtilizationPercent(app.targetCpuUtilizationPercent)
                .build()
        )

What did you expect to happen?

I expect to see something like this: [Screenshot 2021-04-21 at 09 47 36]

What actually happened?

Instead I see: [Screenshot 2021-04-21 at 09 42 14]

Environment

Other

What this looks like in CloudFormation: [Screenshot 2021-04-21 at 10 33 00]


This is :bug: Bug Report

robertkarlsson commented 3 years ago

We deployed one of our services using aws-cdk version 1.100.0 around 2 hours ago. It lost all of its autoscaling configuration, and the configuration does not come back when we redeploy that service. We made no changes to our CDK codebase. We use scaleOnCpuUtilization, which relies on a CloudWatch metric, and it still works for the production stack of that service (which has not been redeployed today).

I can see in the AWS notices that:

[4:08 AM PDT] We have identified the cause of the increased faults for CloudWatch GetMetricData API in the EU-CENTRAL-1 Region. We expect to resolve this issue in the next few hours. You are able to view Cloudwatch Data if you set your time range to a setting that is less than 3 hours.

I think GetMetricData and scaleOnCpuUtilization are connected, so this issue might be related to that outage if you are in that region.

mimozell commented 3 years ago

Aha, thanks. I'm in the Ireland region (eu-west-1) and don't have any such notification, so I think it might be something else.

robertkarlsson commented 3 years ago

I can confirm that the issue was not resolved when the CloudWatch GetMetricData API became stable again.

We solved the issue by removing the autoscaling policy from the CDK, deploying, adding it back and then deploying again.

Our configuration looks like this, with CDK 1.100.0 running on Node 14.16.1:

    const autoScale = this.service.autoScaleTaskCount({
      minCapacity: 1,
      maxCapacity: 2,
    });

    autoScale.scaleOnCpuUtilization("CPUAutoscaling", {
      targetUtilizationPercent: 50,
      scaleInCooldown: cdk.Duration.seconds(30),
      scaleOutCooldown: cdk.Duration.seconds(30),
    });

Our CloudFormation resources look the same as yours.

mimozell commented 3 years ago

Removing and re-adding made no difference for me :( And neither did doing the exact same with 1.100.0.

FelixRelli commented 3 years ago

Probably related: for scheduled tasks (scaleOnSchedule) the scaling works, but the scheduled actions do not show up under Scheduled Tasks in the UI.

    const scaling = ecsService.autoScaleTaskCount({ maxCapacity: 1 });
    scaling.scaleOnSchedule('StartVectorizationTask', {
      schedule: Schedule.cron({ minute: '10' }),
      maxCapacity: 1,
      minCapacity: 1,
    });
    scaling.scaleOnSchedule('StopVectorizationTask', {
      schedule: Schedule.cron({ minute: '20' }),
      maxCapacity: 0,
      minCapacity: 0,
    });

CDK v1.129.0

awsiv commented 2 years ago

This is because the ClusterName now contains an ARN, and the CloudWatch alarm (used for autoscaling) does not accept it.

If you switch to using the plain cluster name, it should work. However, this requires an additional step to change the cluster name. CDK should handle this properly.

github-actions[bot] commented 1 year ago

This issue has not received any attention in 1 year. If you want to keep this issue open, please leave a comment below and auto-close will be canceled.

morishin commented 11 months ago

This problem is still alive...

sakurai-ryo commented 10 months ago

Hi @morishin, I would like to fix this problem, but the code in the issue you created did not reproduce it. If you don't mind, could you please share code that I can copy and paste to reproduce it? https://github.com/aws/aws-cdk/issues/28300

sharathvignesh commented 8 months ago

I encountered an interesting issue. In my case, the cluster is imported, and when I add my service with autoscaling to it, no autoscaling configuration is added.

But if I create the cluster in the same application instead of importing it, the autoscaling configuration is visible in the console.

jflitton commented 4 months ago

TL;DR: clusterName on the Cluster construct is an ARN. Full explanation follows:

I have this issue as well. The CDK appears to successfully create the scalable target and policy, but they don't appear in the ECS web console. The issue appears to be that the resourceId is incorrectly formatted.

Here's the format of the resourceId generated by the CDK: service/arn:aws:ecs:<region>:<account-number>:cluster/<cluster-name>/<service-name>

And here's the format of resourceId when I use the AWS CLI (which works as expected): service/<cluster-name>/<service-name>
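
To make the difference between the two formats concrete, here is a minimal sketch of a checker for a resourceId string, for example one copied from the output of `aws application-autoscaling describe-scalable-targets --service-namespace ecs`. The function name `inspectEcsResourceId` and the `ResourceIdInfo` type are hypothetical and not part of the CDK; the two patterns are taken only from the formats quoted above, and the assumption that cluster and service names contain no `/` or `:` is mine.

```typescript
// Hypothetical helper (not part of the CDK): classifies an Application Auto Scaling
// resourceId for an ECS service against the two shapes quoted above.
interface ResourceIdInfo {
  clusterName: string;
  serviceName: string;
  // true when the cluster segment is a full ARN rather than a plain name
  embedsClusterArn: boolean;
}

function inspectEcsResourceId(resourceId: string): ResourceIdInfo | undefined {
  // Expected shape: service/<cluster-name>/<service-name>
  const plain = /^service\/([^/:]+)\/([^/:]+)$/.exec(resourceId);
  if (plain) {
    return { clusterName: plain[1], serviceName: plain[2], embedsClusterArn: false };
  }

  // Problematic shape:
  // service/arn:<partition>:ecs:<region>:<account-number>:cluster/<cluster-name>/<service-name>
  const withArn =
    /^service\/arn:[^:]+:ecs:[^:]*:[^:]*:cluster\/([^/:]+)\/([^/:]+)$/.exec(resourceId);
  if (withArn) {
    return { clusterName: withArn[1], serviceName: withArn[2], embedsClusterArn: true };
  }

  // Anything else is not an ECS service resourceId in either of the two shapes.
  return undefined;
}

// Rebuilds the resourceId in the shape that the AWS CLI example above uses.
function toExpectedResourceId(info: ResourceIdInfo): string {
  return `service/${info.clusterName}/${info.serviceName}`;
}
```

This only works on concrete strings from a deployed stack. Inside a CDK app the cluster and service names are usually unresolved tokens at synth time, so a check like this cannot be applied there.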

Here's the TypeScript I used to work around this issue. This code uses the CDK ScalableTarget construct directly so that I can manually specify the resourceId:

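// Imports assume CDK v2 (aws-cdk-lib); the original comment did not show them.
import { PredefinedMetric, ScalableTarget, ServiceNamespace } from 'aws-cdk-lib/aws-applicationautoscaling';
import { FargateService, ICluster } from 'aws-cdk-lib/aws-ecs';

// Strip everything up to and including "cluster/" so that an ARN is reduced to the plain cluster name.
function getFargateServiceResourceId(cluster: ICluster, service: FargateService) {
  const clusterName = cluster.clusterName.replace(/.*cluster\//, '');
  return `service/${clusterName}/${service.serviceName}`;
}

// The code below runs inside a construct or stack, where `this` is the scope.
const scalableTarget = new ScalableTarget(this, 'ScalableTarget', {
    serviceNamespace: ServiceNamespace.ECS,
    resourceId: getFargateServiceResourceId(ecsCluster, ecsService.service),
    scalableDimension: 'ecs:service:DesiredCount',
    minCapacity: 1,
    maxCapacity: 5,
});

// Target tracking on average CPU utilization of the service.
scalableTarget.scaleToTrackMetric('ScalingPolicy', {
    targetValue: 70,
    predefinedMetric: PredefinedMetric.ECS_SERVICE_AVERAGE_CPU_UTILIZATION,
});

GavinZZ commented 1 month ago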

Can someone try again with the latest CDK version (v2.160.0 or later)?

I tried to reproduce this issue using the following CDK app, which matches the code provided by the original poster and in several comments. I was able to deploy the CDK app, and when I check the ECS console page, I do see the following showing up in my console.

[Screenshot 2024-10-15 at 11 20 27]

I also checked @jflitton's comment and found different behaviour. The CDK code sets the resourceId field with resourceId: `service/${this.cluster.clusterName}/${this.serviceName}`, and cluster.clusterName is set by this.clusterName = this.getResourceNameAttribute(resource.ref), which is equivalent to !Ref Cluster in the CFN template. According to the official CFN documentation, this should return the cluster name rather than the cluster ARN; see https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-ecs-cluster.html#aws-resource-ecs-cluster-return-values

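// Imports assume CDK v2 (aws-cdk-lib); the original comment did not show them.
import * as cdk from 'aws-cdk-lib';
import * as ec2 from 'aws-cdk-lib/aws-ec2';
import * as ecs from 'aws-cdk-lib/aws-ecs';
import { Construct } from 'constructs';

export class CdkAppStack extends cdk.Stack {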
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);
    const vpc = new ec2.Vpc(this, 'Vpc', { maxAzs: 2, restrictDefaultSecurityGroup: false });

    const clusterNew = new ecs.Cluster(this, 'FargateCluster', { vpc });
    const cluster = ecs.Cluster.fromClusterArn(this, 'MyCluster', clusterNew.clusterArn);

    const taskDefinition = new ecs.FargateTaskDefinition(this, 'TaskDef', {
      memoryLimitMiB: 1024,
      cpu: 512,
    });

    taskDefinition.addContainer('web', {
      image: ecs.ContainerImage.fromRegistry('amazon/amazon-ecs-sample'),
      portMappings: [{
        containerPort: 80,
        protocol: ecs.Protocol.TCP,
      }],
    });

    const service = new ecs.FargateService(this, 'Service', {
      cluster,
      taskDefinition,
    });

    const autoScale = service.autoScaleTaskCount({
      minCapacity: 1,
      maxCapacity: 2,
    });

    autoScale.scaleOnCpuUtilization("CPUAutoscaling", {
      targetUtilizationPercent: 50,
      scaleInCooldown: cdk.Duration.seconds(30),
      scaleOutCooldown: cdk.Duration.seconds(30),
    });

    new cdk.CfnOutput(this, 'ResourceId', { value: `service/${cluster.clusterName}/${service.serviceName}` });

  }
}