Open rapphil opened 1 year ago
Cool! Glad we're addressing this issue. :-)
Hi, are there any timelines or proposals, planned or in place, to address the cold start issues shown in the recorded metrics?
Just ICYMI, Java agent has fast startup support for Lambda: https://github.com/open-telemetry/opentelemetry-lambda/tree/main/java#fast-startup-for-java-agent
Is your feature request related to a problem? Please describe.
A Lambda cold start happens when a new instance of a Lambda function must be created and initialized. The cold start is the delay that this initialization adds between the invocation and the moment the function starts running. New instances need to be initialized whenever other instances have expired due to inactivity or when there are more invocations than active instances. Cold starts are inherent to Lambda functions because it is not possible to keep a Lambda instance initialized forever.
The OpenTelemetry SDK was not created with Lambda functions in mind. If you use OpenTelemetry inside a Lambda function, the overhead of initializing the SDK and, optionally, auto-instrumenting the application code adds to the cold start time. This is especially painful for users because it adds latency to their application and increases the cost of running the functions.
Describe the solution you'd like
This proposal will tackle the cold start time of the OpenTelemetry lambda layers with the following plan:
Plan:
Methodology for measuring the cold startup:
For each supported layer, measure the cold start time of Lambda functions with and without the layer.
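One possible way to do this comparison, as a rough sketch rather than the agreed methodology: Lambda writes a `REPORT` line to the function's logs after each invocation, and on cold starts that line includes an `Init Duration: <n> ms` field. Assuming the log lines for a function without the layer and for the same function with the layer have already been collected, the overhead can be estimated as the difference of the median init durations. The helper names `init_durations` and `cold_start_overhead` are hypothetical and exist only for this illustration.

```python
import re
from statistics import median

# Matches the "Init Duration" field that Lambda adds to REPORT lines on cold starts.
INIT_RE = re.compile(r"Init Duration: ([\d.]+) ms")


def init_durations(log_lines):
    """Return the Init Duration values (ms) found in REPORT log lines.

    Warm invocations have no Init Duration field and are skipped.
    """
    durations = []
    for line in log_lines:
        if not line.startswith("REPORT"):
            continue
        match = INIT_RE.search(line)
        if match:
            durations.append(float(match.group(1)))
    return durations


def cold_start_overhead(baseline_lines, layer_lines):
    """Median init duration with the layer minus the median without it (ms).

    Raises statistics.StatisticsError if either input has no cold starts.
    """
    baseline = median(init_durations(baseline_lines))
    with_layer = median(init_durations(layer_lines))
    return with_layer - baseline
```

The median is used here instead of the mean so that a single unusually slow cold start does not dominate the result; reporting percentiles such as p90 or p99 alongside it would be a reasonable extension.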
Methodology for profiling the lambda functions:
TBD - We will need to
Additional context
References
https://github.com/open-telemetry/opentelemetry-lambda/issues/263