Closed atennak1 closed 1 year ago
Thanks for submitting this @atennak1 🙏
I'm having a look at the code and sharing a couple of thoughts.
This looks great 🎉
Just one line to be tested: https://coveralls.io/builds/57227309/source?filename=lambda%2Futils.js#L166
I'll have another look and do some testing by Monday :)
I've run a few tests, and it can easily take a couple of minutes for a new version to be ready, so I would increase the waiter's total timeout: say, from 5*24 = 120 seconds (2 minutes) to 10*24 = 240 seconds (4 minutes).
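To make the budget arithmetic concrete, here's a minimal sketch of a generic polling waiter (this helper and its names are illustrative, not the SDK's built-in waiter or the PR's actual code; only the delay/attempt numbers come from the discussion above):

```javascript
// Hypothetical generic waiter: polls checkFn until it returns true,
// up to maxAttempts times, sleeping delaySeconds between attempts.
async function waitUntil(checkFn, { delaySeconds, maxAttempts }) {
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        if (await checkFn()) return attempt;
        await new Promise((resolve) => setTimeout(resolve, delaySeconds * 1000));
    }
    throw new Error(`Timed out after ${delaySeconds * maxAttempts} seconds`);
}

// Total timeout budget = delaySeconds * maxAttempts:
//    5 s * 24 attempts = 120 s (2 minutes, the current budget)
//   10 s * 24 attempts = 240 s (4 minutes, the proposed budget)
```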
I'm checking with the Lambda team and applying this change myself asap, no action needed.
Quick update:
There are two components to the wait time: time for Lambda to create the snapshot and duration of the customer’s initialization code. The former is generally under five minutes, but occasionally can exceed that. The initialization code duration is customer-dependent and can run for up to 15 minutes.
Based on this feedback, I'm afraid the best approach would be moving the waiter into the state machine, between the Initializer and the Executor (instead of inside the Executor), to avoid paying for idle time and potentially timing out the Executor.
But that sounds too complex, and it's not worth the cost of the additional state transitions for the majority of customers who aren't using SnapStart. So I think we'll be fine with increasing maxAttempts to cover the maximum invocation time of the Executor function (10*90 = 900 seconds).
While benchmarking Lambda's new SnapStart feature, I noticed that aliases take a while to become active. Without this change, the Executor step would fail with an error along the lines of "invalid function state: pending". In my testing it takes upwards of 2 minutes for a power-factor alias to become active when SnapStart is enabled; I'm guessing this is because Lambda needs to take a VM snapshot for every new execution environment (power factor).
I tested this with unit tests and empirically against a SnapStart-enabled Lambda function.
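The check the waiter needs to pass can be sketched roughly as follows. This is illustrative only, not the PR's code: the `getState` parameter stands in for fetching the function's `State` field (e.g. via the Lambda API's GetFunctionConfiguration on the alias-qualified function), and the default numbers match the 10*90 = 900-second budget discussed above:

```javascript
// Illustrative sketch: poll the function state behind an alias until it is
// Active, so the Executor doesn't invoke a function that is still Pending
// (e.g. while Lambda takes the SnapStart VM snapshot).
async function waitForActiveState(getState, { delaySeconds = 10, maxAttempts = 90 } = {}) {
    for (let attempt = 0; attempt < maxAttempts; attempt++) {
        const state = await getState(); // e.g. 'Pending', 'Active', 'Failed'
        if (state === 'Active') return;
        if (state === 'Failed') throw new Error('Function entered Failed state');
        await new Promise((resolve) => setTimeout(resolve, delaySeconds * 1000));
    }
    // 10 s * 90 attempts = 900 s, covering the Executor's max invocation time
    throw new Error('Function never became Active');
}
```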