alexcasalboni / aws-lambda-power-tuning

AWS Lambda Power Tuning is an open-source tool that can help you visualize and fine-tune the memory/power configuration of Lambda functions. It runs in your own AWS account - powered by AWS Step Functions - and it supports three optimization strategies: cost, speed, and balanced.
Apache License 2.0
5.29k stars 363 forks

Add parameter to power-tune only cold starts #177

Closed alexcasalboni closed 2 months ago

alexcasalboni commented 1 year ago

Implement #176

Still a few tests to run, but it works :)

I have not encountered any quotas or rate limiting when creating new versions/aliases.

As expected, there's no major difference in the invocation time of the executor step, besides the cold start itself. The executor just needs to invoke the right aliases, either in series or in parallel.

Two notes:

  1. It's highly recommended to increase the TotalExecutionTimeout parameter at deploy time (it's 5 minutes by default and it applies to all functions). Given the maximum of 15 minutes, there is an upper bound on the total number of aliases/versions that can be created (roughly 35 per minute, for a maximum of about 500).
  2. Most of the initialization time is spent waiting on UpdateFunctionConfiguration (using an SDK waiter), and the current waiter delay is 500ms. I'll try to optimize this a bit so we can raise the upper bound mentioned above.
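
The upper bound in note 1 is simple arithmetic, sketched below in Python. The function name `max_versions` is hypothetical, and the sketch assumes the whole timeout budget is spent creating versions/aliases at the observed rate of roughly 35 per minute.

```python
import math


def max_versions(timeout_seconds: float, versions_per_minute: float = 35.0) -> int:
    """Estimate how many versions/aliases fit within the timeout budget.

    Assumption: the entire timeout is spent on version/alias creation,
    at the roughly constant rate reported in this thread.
    """
    if timeout_seconds <= 0 or versions_per_minute <= 0:
        return 0
    return math.floor(timeout_seconds / 60.0 * versions_per_minute)


print(max_versions(300))  # default 5-minute timeout -> 175
print(max_versions(900))  # 15-minute maximum -> 525
```

The 15-minute figure lands close to the rough maximum of 500 quoted above. The real limit is likely a bit lower, since part of the timeout goes to work other than creating versions.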
alexcasalboni commented 1 year ago

Just did another test with the waiter delay set to 250ms, hoping to reduce the initialization time.

Unfortunately, I got the same numbers. Looks like I had already optimized the waiter delay to 500ms (it's 5s by default).

alexcasalboni commented 1 year ago

I'd also recommend using discardTopBottom=0. Since all invocations are cold starts, there's no need to discard outlier results.

@Parro do you think it makes sense to automatically set discardTopBottom to 0 when onlyColdStarts=true? I'd rather not, so you can still choose whether to discard something or not. At the same time, I can't see why you'd want to discard anything in this case.
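
For illustration, an execution input combining the two parameters might look like the sketch below. `onlyColdStarts` and `discardTopBottom` are the parameters discussed in this thread. The other field names (`lambdaARN`, `powerValues`, `num`) are assumptions based on the tool's documented input format, and all values are placeholders.

```python
import json

# Hypothetical state machine input; the ARN and numbers are placeholders.
execution_input = {
    "lambdaARN": "arn:aws:lambda:us-east-1:123456789012:function:my-function",
    "powerValues": [128, 256, 512, 1024],
    "num": 20,
    "onlyColdStarts": True,   # parameter proposed in this PR
    "discardTopBottom": 0,    # keep every result, since all invocations are cold
}

print(json.dumps(execution_input, indent=2))
```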

Parro commented 1 year ago

> @Parro do you think it makes sense to automatically set discardTopBottom to 0 when onlyColdStarts=true? I'd rather not, so you can still choose whether to discard something or not. At the same time, I can't see why you'd want to discard anything in this case.

I think there are still cases where a Lambda could have some values that should be trimmed, for example when it depends on a remote service that could take longer to respond on some calls. So I agree with you: I think we should let the user choose whether or not to trim the results.
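
A minimal sketch of the effect, assuming discardTopBottom is the fraction of results dropped from each end before averaging. This is not necessarily how the tool implements it. The function name `trimmed_average` is hypothetical and the durations are made-up values, with one slow call standing in for a remote service that took longer to respond.

```python
def trimmed_average(durations, discard_top_bottom):
    """Average durations after dropping a fraction of the lowest and highest values."""
    if not durations:
        raise ValueError("durations must not be empty")
    if not 0 <= discard_top_bottom < 0.5:
        raise ValueError("discard_top_bottom must be in [0, 0.5)")
    ordered = sorted(durations)
    k = int(len(ordered) * discard_top_bottom)  # values to drop from each end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


# Made-up cold start durations in ms; 2400 simulates a slow remote call.
samples = [810, 795, 820, 805, 2400, 800, 790, 815, 808, 812]

print(trimmed_average(samples, 0))    # outlier included
print(trimmed_average(samples, 0.2))  # outlier trimmed
```

With these numbers the single slow call pulls the untrimmed average well above the typical duration, which is the case where a user might still want a non-zero discardTopBottom even with cold starts only.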

Parro commented 1 year ago

I was thinking: what if we invert the logic and, instead of onlyColdStarts, use a noColdStarts parameter? That way, in the common scenario the tests are always run with cold Lambdas, and if the user wants to concentrate only on the results of warm Lambdas, they can set noColdStarts to true and the state machine will behave as it does now. With PR #173, Lambda execution time and cold start time are separated, so we could even plot the two numbers in the graph.

We should also explain in the docs that execution is slower with noColdStarts set to false, so the user can decide to give up cold start data in exchange for more speed.

alexcasalboni commented 1 year ago

@Parro I'd rather keep the current behaviour as the default one. Considering only cold starts is an "edge case" that makes sense only in specific situations. For the vast majority of cases (and invocations), you want to power-tune your functions for warm invocations.

Parro commented 1 year ago

> @Parro I'd rather keep the current behaviour as the default one. Considering only cold starts is an "edge case" that makes sense only in specific situations. For the vast majority of cases (and invocations), you want to power-tune your functions for warm invocations.

OK, maybe I'm biased by my need to improve the cold starts of my Lambdas 😉

alexcasalboni commented 2 months ago

Closing this, as we're prioritizing the approach of #206, mainly because it removes the upper bound on the number of versions/aliases you can create and because it solves another issue related to SnapStart.