newrelic / newrelic-react-native-agent

New Relic Mobile Agent SDK for React-Native Applications
Apache License 2.0

Android native crashes - SIGSEGV #196

Closed. MFazio23 closed this issue 5 days ago

MFazio23 commented 2 months ago

Description

We are experiencing a massive number of crashes in our production application, and we finally determined that they are caused by the inclusion of the NewRelic RN agent library. The crash shows up in a few different forms, but all of them appear to be tied back to the same issue.

These crashes happen primarily on Motorola devices (almost 90%) though not exclusively.

We temporarily removed NR from our latest production release and the crashes completely went away.

Steps to Reproduce

We've been able to reproduce this on a few Motorola devices by backgrounding our app and then opening another high-memory app such as the camera. It doesn't happen every time you background the app (like this issue), but we can trigger it pretty consistently.

I'm working on getting a reproducible example I can share.

Expected Behavior

The app should not crash with NewRelic turned on.

Relevant Logs / Console output

Many of the stack traces look something like this:

13:29:59.218  A  Fatal signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x300c03c0300c03 in tid 6616 (hades), pid 6426
13:30:00.424  A  *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
13:30:00.424  A  Build fingerprint: 'motorola/fogo_g/fogo:14/U1UFNS34.41-71-1/3362a1-665f82:user/release-keys'
13:30:00.424  A  Revision: 'pvt'
13:30:00.424  A  ABI: 'arm64'
13:30:00.424  A  Timestamp: 2024-08-22 13:29:59.531322138-0500
13:30:00.424  A  Process uptime: 265s
13:30:00.425  A  pid: 6426, tid: 6616, name: hades
13:30:00.425  A  uid: 10469
13:30:00.425  A  signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x00300c03c0300c03
13:30:00.425  A      x0  00000073417ef700  x1  000000733ec00390  x2  00000074d3405df8  x3  000000724a4b4c00
13:30:00.425  A      x4  0000000000000000  x5  00000074d30cedc0  x6  00000074b7b703ec  x7  0000000000000000
13:30:00.425  A      x8  0000000000000001  x9  b40000724a4b4810  x10 b40000724a4b4c00  x11 c0300c03c0300c03
13:30:00.425  A      x12 c0300c03c0300c03  x13 0000007573ad5740  x14 0000000004000000  x15 0000000000000028
13:30:00.425  A      x16 0000000000000001  x17 0000007573bad410  x18 0000007417834000  x19 000000733ec00818
13:30:00.425  A      x20 0000000000000001  x21 00000073417eeff0  x22 00000073417ef700  x23 0000000000000040
13:30:00.425  A      x24 00000073417ef740  x25 000000000000001e  x26 0000007341800000  x27 b400007436133500
13:30:00.425  A      x28 000000000010a400  x29 00000074b7b70ab0
13:30:00.425  A      lr  000000741c826db4  sp  00000074b7b70a70  pc  000000741c795650  pst 0000000080001000
13:30:00.425  A  8 total frames
13:30:00.425  A  backtrace:
13:30:00.425  A        #00 pc 0000000000165650  /data/app/~~HrRm2TLaVd9NFaAl3mzngg==/base.apk!libhermes.so (offset 0x3672000) (BuildId: 7b331e66c1a76e127578ead45dce2fe9a7fda535)
13:30:00.425  A        #01 pc 00000000001f6db0  /data/app/~~HrRm2TLaVd9NFaAl3mzngg==/base.apk!libhermes.so (offset 0x3672000) (BuildId: 7b331e66c1a76e127578ead45dce2fe9a7fda535)
13:30:00.425  A        #02 pc 00000000002000a0  /data/app/~~HrRm2TLaVd9NFaAl3mzngg==/base.apk!libhermes.so (offset 0x3672000) (BuildId: 7b331e66c1a76e127578ead45dce2fe9a7fda535)
13:30:00.425  A        #03 pc 00000000001fe768  /data/app/~~HrRm2TLaVd9NFaAl3mzngg==/base.apk!libhermes.so (offset 0x3672000) (BuildId: 7b331e66c1a76e127578ead45dce2fe9a7fda535)
13:30:00.425  A        #04 pc 00000000001ff7f8  /data/app/~~HrRm2TLaVd9NFaAl3mzngg==/base.apk!libhermes.so (offset 0x3672000) (BuildId: 7b331e66c1a76e127578ead45dce2fe9a7fda535)
13:30:00.425  A        #05 pc 00000000001ff5c0  /data/app/~~HrRm2TLaVd9NFaAl3mzngg==/base.apk!libhermes.so (offset 0x3672000) (BuildId: 7b331e66c1a76e127578ead45dce2fe9a7fda535)
13:30:00.425  A        #06 pc 00000000000fcf74  /apex/com.android.runtime/lib64/bionic/libc.so (__pthread_start(void*)+208) (BuildId: 284d65da9c7eadcb4f58fe07ba016b9f)
13:30:00.425  A        #07 pc 0000000000096924  /apex/com.android.runtime/lib64/bionic/libc.so (__start_thread+68) (BuildId: 284d65da9c7eadcb4f58fe07ba016b9f)

Your Environment

Additional context

We had also been working with the Hermes team on this issue for a while, to no avail, before finding that removing NR resolved it.

We also attempted to turn off instrumentation of some packages using excludePackageInstrumentation(...), but that only fixed the issue if we excluded the com.newrelic.agent.android package, which obviously didn't allow us to continue using NR.

ladp333 commented 2 months ago

We upgraded the NR agent to the latest version when we upgraded RN to 0.74.5 in our React Native app a couple of days ago. Since then, we have experienced numerous crashes. Fortunately, this was only in the internal beta.

This is not the first time we’ve had issues with this SDK. Considering how common these use cases are, I can’t believe how the team releases the SDK. They probably don’t test it properly. If they did, they wouldn’t miss these bugs. To be honest, I have lost confidence in this SDK and have decided to disable it for now. I’m sorry to say that the quality of this SDK is very low, and I didn’t expect such a poor-quality product from New Relic. Either the team needs to step up their game or hand it over to a more capable team within the company. We are paying top dollar to New Relic.

Fatal Exception: java.lang.NoSuchMethodError: No virtual method stop()Z in class Lcom/newrelic/agent/android/ndk/AgentNDK; or its super classes (declaration of 'com.newrelic.agent.android.ndk.AgentNDK' appears in /data/app/~~UXfp2ZvKcxmNz4l0MYNIOg==/com.originmobileapp-et4-xbWwrBKQebcKoc_EfQ==/base.apk!classes5.dex)
    at com.newrelic.agent.android.ndk.NativeReporting.stop(NativeReporting.java:108)
    at com.newrelic.agent.android.ndk.NativeReporting.shutdown(NativeReporting.java:53)
    at com.newrelic.agent.android.AndroidAgentImpl.stop(AndroidAgentImpl.java:589)
    at com.newrelic.agent.android.AndroidAgentImpl.stop(AndroidAgentImpl.java:521)
    at com.newrelic.agent.android.AndroidAgentImpl.applicationBackgrounded(AndroidAgentImpl.java:681)
    at com.newrelic.agent.android.background.ApplicationStateMonitor.notifyApplicationInBackground(ApplicationStateMonitor.java:102)
    at com.newrelic.agent.android.background.ApplicationStateMonitor.lambda$uiHidden$0$com-newrelic-agent-android-background-ApplicationStateMonitor(ApplicationStateMonitor.java:63)
    at com.newrelic.agent.android.background.ApplicationStateMonitor$$ExternalSyntheticLambda0.run(:2)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:644)
    at java.lang.Thread.run(Thread.java:1012)

MFazio23 commented 2 months ago

@ladp333 - yep, that's a different issue we've seen as well. The discussion for that crash is happening here: https://github.com/newrelic/newrelic-react-native-agent/issues/183

ladp333 commented 2 months ago

@MFazio23 Yeah, I know. Apologies, I was so frustrated with this SDK that I didn't care about putting my comment in the right thread. I just dumped my frustration :-)

YuriLima23 commented 2 months ago

When I switch my application between the foreground and the background very quickly, many times in a row, the crash happens.

https://github.com/user-attachments/assets/4f3e9d75-5c79-4c0d-b829-9e6050603fd5

ndesai-newrelic commented 2 months ago

@ladp333 @MFazio23 @YuriLima23 the issue is fixed in 1.4.4.

Regarding the Recent Application Crash Issue

We sincerely apologize for the inconvenience caused by the recent application crashes when transitioning to the background. We understand your frustration, and we'd like to provide a detailed explanation of what occurred and the steps we're taking to prevent similar issues in the future.

Root Cause

The issue arose due to a complex interaction between our Android and NDK agent releases:

  1. We use a "+" convention for version numbering of both Android and NDK agents, which automatically pulls the latest versions during builds.
  2. The Android and NDK agents were released in quick succession.
  3. Our testing did not catch the scenario where a newer NDK agent was paired with an older Android agent.

This led to a method not found error, initially misdiagnosed as an NDK-side issue. Our attempt to resolve it by reverting NDK changes in an NDK agent release did not address the core problem: a version mismatch between the Android and NDK agents.
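As a rough illustration of the "+" problem described above, the sketch below flags Gradle dependency notations whose version segment is dynamic. The helper name and the version numbers are hypothetical and chosen only for illustration; this is not part of any New Relic tooling, just one way a consuming project might check that it pins both agents explicitly.

```typescript
// Hypothetical guard: detect Gradle "group:name:version" notations that use
// a dynamic "+" version, which resolves to whatever is newest at build time.
function hasDynamicVersion(notation: string): boolean {
  const parts = notation.split(":");
  // The version is the third segment; a notation without one is not flagged.
  const version = parts.length >= 3 ? parts[2] : "";
  return version.includes("+");
}

// Illustrative dependency list; the version numbers are made up.
const dependencies: string[] = [
  "com.newrelic.agent.android:android-agent:7.+", // dynamic: may change between builds
  "com.newrelic.agent.android:agent-ndk:1.1.1", // pinned: same artifact every build
];

// Collect every notation that would let the two agents drift apart.
const dynamicDependencies = dependencies.filter(hasDynamicVersion);
console.log(dynamicDependencies);
```

Pinning both artifacts to exact versions means a newer NDK agent cannot be paired silently with an older Android agent.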

Our Response and Future Prevention

To prevent similar issues in the future, we are implementing the following measures:

  1. More comprehensive testing scenarios on version compatibility between components.
  2. Modifying our release process to ensure thorough validation of all agent combinations.
  3. Reconsidering our use of the "+" convention for version numbers to have more predictable builds.

Moving Forward

We value your trust and are committed to delivering a stable and reliable React Native agent. We will conduct more thorough testing for future releases to prevent such issues from recurring.

If you need any additional information or clarification regarding this issue, please don't hesitate to reach out. We're here to address any concerns you may have.

Thank you for your patience and understanding as we work to improve our product and processes.

arelstone commented 1 month ago

@ndesai-newrelic will this require us to do anything other than bumping the version to 1.4.4?

MFazio23 commented 1 month ago

@ndesai-newrelic - appreciate the update here and the improved process on your end.

The updated NR agent version fixed https://github.com/newrelic/newrelic-react-native-agent/issues/183 but does not resolve the current issue. I ran another build yesterday with the new library version and I'm seeing the same issue as before. To be clear, this issue is a SIGSEGV that occurs occasionally (primarily on Motorola devices), not a consistent crash every time the app is backgrounded.

What other info can I get for you to help troubleshoot what's going on here? Also, could you please re-open this issue as it is not resolved?

ndesai-newrelic commented 1 month ago

@MFazio23 can you disable nativeCrashReportingEnabled for android?

ndesai-newrelic commented 1 month ago

@MFazio23 is your issue fixed?

MFazio23 commented 1 month ago

> @MFazio23 is your issue fixed?

I'm thinking so (it was in local testing), but I'm waiting on a production release to completely confirm. Will update after that's done.

danarchos commented 1 month ago

The issue is still occurring for us even with the latest version, 1.4.5 @ndesai-newrelic

vlimag commented 1 month ago

> can you disable nativeCrashReportingEnabled for android?

@ndesai-newrelic Are we going to lose crash reporting data for Android this way?

ndesai-newrelic commented 1 month ago

@vlimag nope, it will only disable the functionality of the New Relic NDK agent, which records C++ crashes.

vlimag commented 1 month ago

FYI, setting nativeCrashReportingEnabled to false fixed it for us @ndesai-newrelic, thanks

MFazio23 commented 5 days ago

> @MFazio23 is your issue fixed?

I can finally confirm that setting nativeCrashReportingEnabled to false did indeed fix the issue here. Thank you again!
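For anyone landing here later, a minimal sketch of the workaround might look like the following. It assumes the agent configuration object accepts nativeCrashReportingEnabled (named in this thread) alongside crashReportingEnabled (assumed from the agent's documented options); the MobileOS type and buildAgentConfiguration helper are hypothetical names introduced only for this example.

```typescript
// Sketch only: builds the options object that would be handed to the agent.
// MobileOS and buildAgentConfiguration are hypothetical helper names.
type MobileOS = "android" | "ios";

interface AgentConfiguration {
  crashReportingEnabled: boolean;
  nativeCrashReportingEnabled: boolean;
}

function buildAgentConfiguration(os: MobileOS): AgentConfiguration {
  return {
    // Regular crash reporting stays on for both platforms.
    crashReportingEnabled: true,
    // NDK (C++) crash reporting is switched off on Android only, which is
    // the setting reported to stop the SIGSEGV in this thread.
    nativeCrashReportingEnabled: os !== "android",
  };
}

const androidConfig = buildAgentConfiguration("android");
const iosConfig = buildAgentConfiguration("ios");

// In the app entry point, the result would be passed to the agent, e.g.:
//   NewRelic.startAgent(appToken, androidConfig);
console.log(androidConfig, iosConfig);
```

The trade-off, per the maintainer's reply above, is that C++ crashes are no longer recorded by the NDK agent on Android.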