aws / aws-cdk

The AWS Cloud Development Kit is a framework for defining cloud infrastructure in code
https://aws.amazon.com/cdk
Apache License 2.0
11.7k stars 3.93k forks source link

aws-lambda: `scaleOnUtilization` should track `Max` statistic instead of `Average` #23088

Open fakocher opened 2 years ago

fakocher commented 2 years ago

Describe the bug

Lambda auto-scaling using scaleOnUtilization doesn't work properly as it is tracking Average of ProvisionedConcurrencyUtilization instead of Max.

Expected Behavior

scaleOnUtilization is tracking Max of ProvisionedConcurrencyUtilization

Current Behavior

scaleOnUtilization is tracking Average of ProvisionedConcurrencyUtilization.

This doesn't work properly and our ProvisionedConcurrencyUtilization is hitting 100% regularly.

Reproduction Steps

const autoScaling = alias.addAutoScaling({
  minCapacity: 15,
  maxCapacity: 50
});

autoScaling.scaleOnUtilization({
  utilizationTarget: 0.3
});

Possible Solution

https://github.com/aws/aws-cdk/blob/fedd51b045cb312e7259d1d20fc119c086d7d31f/packages/%40aws-cdk/aws-lambda/lib/private/scalable-function-attribute.ts#L23

Here we should use Max.

Additional Information/Context

No response

CDK CLI Version

2.49.0 (build 793dd76)

Framework Version

No response

Node.js Version

v14.21.1

OS

Linux

Language

Typescript

Language Version

No response

Other information

No response

peterwoodworth commented 2 years ago

I think to accomplish this we would need to add a new ScalingPolicy, same goes for anyone if you want to achieve this for now.

I think this would be a feature request though, as other users might want to be tracking by AVERAGE

I am marking this issue as p2, which means that we are unable to work on this immediately.

We use +1s to help prioritize our work, and are happy to revaluate this issue based on community feedback. You can reach out to the cdk.dev community on Slack to solicit support for reprioritization.

Check out our contributing guide if you're interested in contributing yourself - there's a low chance the team will be able to address this soon but we'll try to look at a PR

fakocher commented 1 year ago

I understand that changing this would break the scaling of teams that already use the method.

We are alternatively creating the scaling ourselves so this is definitely not P0.

jcuffe commented 4 days ago

I ran into this issue recently, and wanted to note that the documentation for Application Autoscaling and Lambda seem to conflict regarding how the utilization metric should be used. While the Lambda docs state "View this metric using MAX" for ProvisionedConcurrencyUtilization, Application Autoscaling's predefined metrics defines LambdaProvisionedConcurrencyUtilization as the average value.

To my understanding, the predefined metric is incorrect, and should be corrected to use the maximum value. I'm unsure how to suggest that to the appropriate team or the likelihood of this change 😅

In the meantime I plan to create a custom metric so that I can continue to use target tracking, and look into the possibility of contributing this feature upstream without breaking the existing tracking behavior for users who may prefer it.