getsentry / sentry-javascript

Official Sentry SDKs for JavaScript
https://sentry.io
MIT License
7.87k stars 1.55k forks

Dynamically handle AWS lambda timeout warnings based on event trigger #9951

Open StianHanssen opened 8 months ago

StianHanssen commented 8 months ago

Problem Statement

We noticed that Sentry uses the AWS Lambda context's getRemainingTimeInMillis() function to determine when to give a timeout warning. This countdown is based on the timeout configured for the lambda. That means that if I set the timeout to 500 seconds and trigger the lambda through API Gateway (max time limit 30 seconds), and the request times out at 30 seconds, we get no warning, because from the SDK's point of view roughly 470 seconds still remain.

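We use AWS lambdas that are triggered by multiple sources. Depending on the trigger, the max time limit for the lambda differs. For instance, API Gateway caps an invocation at 30 seconds, while a lambda invoked by another lambda can run for its full configured timeout.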

The issue is that we can't configure time limits for each trigger. In our case, we want a lambda that can be triggered by both API Gateway and invoked by other lambdas. In the case of being invoked by other lambdas, we would prefer a longer time limit.

It would be nice if Sentry detected which trigger is being used (by inspecting the event and/or context) and capped the time limit accordingly. So even if we set the time limit to 500 seconds, the limit is capped to 30 seconds when the lambda is triggered by API Gateway. That way one gets timeout warnings consistently.

Solution Brainstorm

We have an idea on how to achieve this by making a wrapper function for tracking count-down using the original context count-down:

const newTimeRemaining = () => newTimeout - (oldTimeout - context.getRemainingTimeInMillis())

In this case, newTimeout would be set based on the trigger. For instance, if event?.requestContext?.http is defined, the trigger is API Gateway, so we set newTimeout = 30000. oldTimeout would be the configured time limit (for example 500000 ms in my initial scenario above). One would only use this function if newTimeout < oldTimeout.
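To make the idea concrete, here is a minimal sketch of such a wrapper. The names detectTriggerTimeout and makeRemainingTimeFn are hypothetical and not part of the Sentry SDK, the 30000 ms cap is the value from the scenario above, and the assumption that event.requestContext.http identifies an API Gateway trigger would need checking against the event shapes actually in use:

```javascript
// Assumed cap for API Gateway triggers, in milliseconds (value from the discussion above).
const API_GATEWAY_TIMEOUT_MS = 30000;

// Hypothetical helper: returns a trigger-specific timeout, or undefined if no cap is known.
function detectTriggerTimeout(event) {
  // Assumption: this field is only present for API Gateway HTTP events.
  if (event?.requestContext?.http) {
    return API_GATEWAY_TIMEOUT_MS;
  }
  return undefined;
}

// Hypothetical helper: builds a replacement for context.getRemainingTimeInMillis().
// oldTimeout is the configured lambda timeout in milliseconds.
function makeRemainingTimeFn(event, context, oldTimeout) {
  const newTimeout = detectTriggerTimeout(event);
  // Only cap when the trigger's limit is stricter than the configured one.
  if (newTimeout === undefined || newTimeout >= oldTimeout) {
    return () => context.getRemainingTimeInMillis();
  }
  // Elapsed time is (oldTimeout - remaining); subtract it from the stricter limit.
  return () => newTimeout - (oldTimeout - context.getRemainingTimeInMillis());
}
```

With oldTimeout = 500000 and 5 seconds elapsed, the wrapped function reports 25000 ms remaining instead of 495000 ms.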

The tricky part is that I am not sure we can know oldTimeout without passing that information in through an environment variable. A possible solution is to call const oldTimeout = context.getRemainingTimeInMillis() immediately on lambda startup, though I am afraid it would be slightly imprecise.
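One way to reduce that imprecision might be to round the first reading up, since Lambda timeouts are configured in whole seconds. The sketch below assumes the helper (estimateConfiguredTimeout, a hypothetical name) is called at the very start of the handler, before a full second of the invocation has elapsed:

```javascript
// Hypothetical helper: estimate the configured timeout from the first
// getRemainingTimeInMillis() reading of an invocation.
function estimateConfiguredTimeout(context) {
  const remaining = context.getRemainingTimeInMillis();
  // Round up to the next whole second; this is only correct if less than
  // one second has passed since the invocation started.
  return Math.ceil(remaining / 1000) * 1000;
}
```

If the call is delayed by more than a second, the estimate would come out too low, so an explicit environment variable remains the more reliable option.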

Your thoughts on this would be much appreciated! I imagine some users may be confused when there is no timeout warning because they didn't configure their timeout according to the implicit timeout imposed by the trigger. I hope there is a possibility for a new feature here.

lforst commented 8 months ago

Putting on backlog. We likely won't get to this any time soon. PRs are welcome!