getsentry / sentry-dart

Sentry SDK for Dart and Flutter
https://sentry.io/for/flutter/
MIT License

Investigate SDK back pressure issues #2360

Closed (denrase closed this 1 week ago)

denrase commented 1 month ago

Description

A user reported that calling Sentry capture methods in quick succession leads to issues, especially with the screenshot integration.

Investigate whether calling the SDK methods in tight loops can be improved. One idea was to introduce a debounce for screenshots.

The minimal repro steps I got are attached, but this is a heavily contrived example aimed at easily reproducing the issue. In practice, this would look like 30 API requests in short succession that all fail, either because there is no internet connectivity or because every endpoint returns an error code. Sentry then captures an exception for each failure, also in short succession.

Image

buenaflor commented 1 month ago

FYI, the real example could lead to up to 75 API calls due to heavy retries (for example, with a bad internet connection).

denrase commented 1 month ago

Looking at the code, we do drop the envelopes due to back pressure, but this is probably happening too late, as we only do this right before calling the transport.

https://github.com/getsentry/sentry-dart/blob/136c365a518666b435c0638af8bcd45b2be3e57a/dart/lib/src/sentry_client.dart#L634

Image

We could either move this to an earlier place in the processing queue, or introduce additional measures, like the mentioned debounce for the screenshot widget. The latter is probably a good addition anyway.
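To make the debounce idea concrete, here is a minimal sketch in plain Dart. It is not the SDK's implementation: `ScreenshotDebouncer`, `shouldSkip`, and `simulateBurst` are hypothetical names, and the 2-second window is an arbitrary assumption. Strictly speaking it is a leading-edge throttle: the first capture in a burst gets a screenshot, and later ones inside the window are skipped.

```dart
/// Hypothetical helper, NOT part of the Sentry SDK.
/// Lets one screenshot through, then skips further ones until
/// [waitTime] has passed since the last screenshot that was taken.
class ScreenshotDebouncer {
  ScreenshotDebouncer({required this.waitTime, DateTime Function()? clock})
      : _clock = clock ?? (() => DateTime.now());

  final Duration waitTime;
  final DateTime Function() _clock; // injectable so the sketch is testable
  DateTime? _lastTaken;

  /// Returns true if the screenshot should be skipped.
  bool shouldSkip() {
    final now = _clock();
    final last = _lastTaken;
    if (last != null && now.difference(last) < waitTime) {
      return true; // still inside the window: skip the expensive work
    }
    _lastTaken = now;
    return false;
  }
}

/// Simulates [events] captures that arrive 10 ms apart and returns
/// how many screenshots would actually be taken.
int simulateBurst(int events) {
  var fakeNow = DateTime(2024, 1, 1);
  final debouncer = ScreenshotDebouncer(
    waitTime: const Duration(seconds: 2), // assumed value for illustration
    clock: () => fakeNow,
  );
  var taken = 0;
  for (var i = 0; i < events; i++) {
    if (!debouncer.shouldSkip()) taken++;
    fakeNow = fakeNow.add(const Duration(milliseconds: 10));
  }
  return taken;
}

void main() {
  // 30 failing requests in quick succession, as in the report above.
  print('screenshots taken: ${simulateBurst(30)}'); // prints "screenshots taken: 1"
}
```

With something like this in place, the events themselves would still be captured; only the screenshot attachment is skipped for the rest of the burst.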

denrase commented 1 month ago

@buenaflor Added a PR for widget debounce. Do you think this will resolve this issue or should we introduce additional measures?

buenaflor commented 1 month ago

> We could either move this to an earlier place in the processing queue

I think this might also be a good solution so we don't trigger all the event processors and thus decrease overhead, right?
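A rough sketch of what "dropping earlier" could look like, in plain Dart. This is only an illustration under assumptions, not how `sentry_client.dart` is structured: `SketchClient`, `maxInFlight`, and `simulateTightLoop` are hypothetical names, and the limit of 5 is arbitrary. The point is that the capacity check runs before any event processor, so dropped events cost almost nothing.

```dart
import 'dart:async';

typedef Event = Map<String, Object?>;
typedef EventProcessor = FutureOr<Event?> Function(Event event);

/// Hypothetical client, NOT the real SentryClient.
class SketchClient {
  SketchClient({
    required this.maxInFlight,
    required this.processors,
    required this.send,
  });

  final int maxInFlight;
  final List<EventProcessor> processors;
  final Future<void> Function(Event event) send;

  int _inFlight = 0;
  int dropped = 0;

  /// Returns true if the event was sent, false if it was dropped.
  Future<bool> capture(Event event) async {
    // Back-pressure check happens first, before any processor runs.
    if (_inFlight >= maxInFlight) {
      dropped++;
      return false;
    }
    _inFlight++;
    try {
      var current = event;
      for (final process in processors) {
        final next = await process(current);
        if (next == null) return false; // a processor discarded the event
        current = next;
      }
      await send(current);
      return true;
    } finally {
      _inFlight--;
    }
  }
}

/// Fires [calls] captures without awaiting in between and returns
/// how many times the (slow) event processor actually ran.
Future<int> simulateTightLoop(int calls) async {
  var processorRuns = 0;
  final client = SketchClient(
    maxInFlight: 5, // assumed limit for illustration
    processors: [
      (event) async {
        processorRuns++;
        // Stand-in for expensive work such as taking a screenshot.
        await Future<void>.delayed(const Duration(milliseconds: 1));
        return event;
      },
    ],
    send: (event) async {},
  );
  await Future.wait([
    for (var i = 0; i < calls; i++) client.capture({'id': i}),
  ]);
  return processorRuns;
}

Future<void> main() async {
  // 75 captures in a tight loop: only the first 5 reach the processor.
  print('processor runs: ${await simulateTightLoop(75)}'); // prints "processor runs: 5"
}
```

If the check only ran right before `send`, the processor in this sketch would run 75 times even though most of the resulting envelopes would be dropped afterwards.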

> Added a PR for widget debounce

👍 It might also be good to check with the Flutter profiler whether any other event processor or similar is creating overhead when calling captureException in a tight loop.