bugsnag / bugsnag-android

BugSnag crash monitoring and reporting tool for Android apps
https://www.bugsnag.com/platforms/android/
Other
1.19k stars 205 forks source link

Client.leaveBreadcrumb is slow() #1974

Open pyricau opened 7 months ago

pyricau commented 7 months ago

Bug description

As measured on Square's hardware, Client.leaveBreadcrumb() is unreasonably slow (for what it's supposed to do, which is store a breadcrumb in a ring buffer).

Data

Here are the results, based on 175 measurements:

image

As you can see, the duration has a wide spread (up to 1 ms!) and the mode is around 173us. Even without the worst case scenario, that's a large amount of time for just adding an entry to a ring buffer.

I started looking into this after noticing that Client.leaveBreadcrumb() was showing up in our simpleperf traces. Anecdotally, it seemed like the time was spent in native code, through the JNI boundary.

Steps to reproduce

I measured the time it takes to call just Client.leaveBreadcrumb() as I navigated around our app with the following code:

measureTime {
  bugsnagClient.leaveBreadcrumb(message, metadata, MANUAL)
}

I speed compiled our app to ensure there's no jitting slowing things down with adb shell cmd package compile -f -m speed com.squareup.

Then I navigated a bunch in our app, which triggered the logging of breadcrumbs

Environment

lemnik commented 7 months ago

Hi @pyricau,

Thanks very much for the detailed issue, you're absolutely right that leaveBreadcrumb has a very high overhead for what it's doing. Most of this overhead indeed comes from the NDK synchronization side of things right now. The breadcrumbs data is copied several times (and in some cases is pre-json marshalled, since the native layer doesn’t currently support Lists and Maps directly) and the native ring-buffer is guarded by a global mutex, all of which is of course non-optimal.

This is an area we're actively working on at the moment (the NDK plugin also doesn't obey the maxBreadcrumbs configuration due to its current design). As part of these changes we're going to try and reduce the general overhead of breadcrumbs (removing much of the copying and locking).

For now it’s worth avoiding leaving “complex” breadcrumb metadata where it can be avoided, while we work to reduce the overhead more generally.

lemnik commented 5 months ago

Hi again @pyricau

Just to let you know we've released the first improvements to the breadcrumb performance work as part of v6.4.0.

This first change will only affect breadcrumbs that have complex metadata, and was caused by a regex pattern being compiled for each metadata item instead of being shared.

We're also working on some more general performance improvements that will affect all breadcrumbs synchronized to the native layer.

pyricau commented 5 months ago

thank you!