Netflix-Skunkworks / raven-python-lambda

Sentry/Raven SDK Integration For AWS Lambda (python) and Serverless
Apache License 2.0

Performance issues in >= 0.1.5 #39

Open seangransee opened 5 years ago

seangransee commented 5 years ago

I had been running a Lambda function for months on raven-python-lambda==0.1.3dev1 and the average runtime was always around 10s. I upgraded to 0.1.10, which caused a huge increase in runtimes, as you can see in the following graph at the end of November:

[Image: screenshot 2019-01-11 16 32 16]

As an experiment to see whether raven-python-lambda was responsible for this performance issue, I deployed many Lambda functions that were identical in every way except for the version of raven-python-lambda. There was a clear increase in runtime starting with 0.1.5 and every version after that.

[Image: screenshot 2019-01-11 16 29 06]

Note the legend in the bottom-left corner. As you can see, the Lambda functions using 0.1.4 and below run in about 10 seconds, and that time stays constant over the 6-hour period shown in this graph. The functions using 0.1.5 and above have more than double the runtime, and it's increasing over time. I've omitted the versions between 0.1.5 and 0.1.10 to make the graph more readable, but they follow the same upward trend. These are identical functions with identical settings running over identical streams of data.

In my production application, I downgraded to 0.1.4 and saw an immediate decrease in runtime which has remained steady throughout the day. Skimming through the PRs that went into 0.1.5, I couldn't find an obvious culprit.

mikegrima commented 5 years ago

Interesting... We haven't made any real updates to the library. Perhaps a dependency update is the culprit?

mikegrima commented 5 years ago

How are you making use of the Sentry reporting? Are you posting directly to Sentry, or are you using our SQS transport?

seangransee commented 5 years ago

Posting to Sentry directly, not doing anything with SQS.

kevgliss commented 5 years ago

Perhaps Raven is the issue here? It looks like we did loosen up the version requirement. Do you have an easy way to run 0.1.5 with the older version of raven?

https://github.com/Netflix-Skunkworks/raven-python-lambda/commit/6bf52961b948bc33492600d196ccba155cb99e25

seangransee commented 5 years ago

That seems to be a likely culprit. I'll mess around with it when I'm back at the office on Monday.

seangransee commented 5 years ago

If anyone wants to see another fun graph, I kept these Lambdas running over the weekend and now have 3 full days of data. Each point represents the average runtime (in ms) over 15 minutes, and each Lambda function gets invoked roughly 17 times per minute.

[Image: screenshot 2019-01-14 09 17 33]
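For anyone who wants to reproduce this kind of graph from raw invocation data, here is a minimal sketch of the 15-minute averaging described above. The function name `average_runtime_per_window` and the `(timestamp, duration_ms)` sample format are assumptions for illustration; they are not part of raven-python-lambda or of any AWS API.

```python
from collections import defaultdict


def average_runtime_per_window(samples, window_seconds=15 * 60):
    """Average invocation durations into fixed-size time windows.

    `samples` is an iterable of (timestamp, duration_ms) pairs, where
    `timestamp` is seconds since the epoch. This input shape is an
    assumption; adapt it to however your durations are exported.
    Returns a dict mapping each window's start time to its mean duration.
    """
    totals = defaultdict(lambda: [0.0, 0])  # window start -> [sum, count]
    for timestamp, duration_ms in samples:
        # Round the timestamp down to the start of its window.
        window_start = int(timestamp // window_seconds) * window_seconds
        bucket = totals[window_start]
        bucket[0] += duration_ms
        bucket[1] += 1
    return {
        start: total / count
        for start, (total, count) in sorted(totals.items())
    }
```

At roughly 17 invocations per minute, each 15-minute point averages about 255 samples, so a steady upward trend across points is unlikely to be noise from a few slow invocations.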

I'm about to do some investigation and see if Sentry's raven package is the culprit.

seangransee commented 5 years ago

From what I've gathered after about 2.5 hours of running these, raven does appear to be the culprit:

[Image: screenshot 2019-01-14 16 07 23]

I'm running another experiment to figure out which version of raven started causing the slowdown. I'll report back when I have something.

seangransee commented 5 years ago

I spoke too soon. Here's the same graph from my previous comment, but with a full 24 hours of data:

[Image: screenshot 2019-01-15 15 02 24]

Here's a screenshot from the function represented by the purple line in the graph, showing that it is indeed running raven==6.1.0:

[Image: screenshot 2019-01-15 15 06 46]
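A screenshot is one way to confirm which version is deployed. Another is to have the function report its own installed package versions at runtime. The following is only a sketch, assuming Python 3.8 or newer (for the standard-library `importlib.metadata` module); the helper name `installed_versions` is hypothetical and is not part of raven or raven-python-lambda.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(package_names):
    """Return {package name: installed version string, or None if missing}."""
    found = {}
    for name in package_names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed in this environment.
            found[name] = None
    return found
```

Calling `installed_versions(["raven", "raven-python-lambda"])` once at cold start and logging the result would make it easy to check, per function, that each experiment is really running the pinned versions.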

As far as I can tell from the data I've collected, the performance issue in >=0.1.5 of raven-python-lambda is unrelated to dependency updates.

Let me know if there are any further experiments I can run to help track this down.