getsentry / sentry-dotnet

Sentry SDK for .NET
https://docs.sentry.io/platforms/dotnet
MIT License

JNI DETECTED ERROR IN APPLICATION when reporting a crash in .Net MAUI Android App #3627

Open dinisvieira-xo opened 5 days ago

dinisvieira-xo commented 5 days ago

Package

Sentry.Maui

.NET Flavor

.NET

.NET Version

8.0.8

OS

Android

SDK Version

4.11.0

Self-Hosted Sentry Version

No response

Steps to Reproduce

This might be a hard issue to reproduce outside of our app. Basically, I started getting "random" crashes 5 to 15 minutes after launching the app. During this time the app reports crashes normally, but then it suddenly crashes with something like "JNI DETECTED ERROR IN APPLICATION: JNI ERROR (app bug): jobject is an invalid global reference: 0x979a", followed by a native crash that didn't offer much information.

I enabled GREF logging and reproduced the issue to find out more about where it occurs. Across several reproductions, the issue always seemed to point at Sentry.Maui.Internal.MauiDeviceData.ApplyMauiDeviceData as a related cause.

Important: If I comment out the code that calls SentrySdk.CaptureException, the app runs for hours with no crashes.

Additional Info:

Sentry Options:

Debug = false;
AutoSessionTracking = true;
IsGlobalModeEnabled = false;
StackTraceMode = StackTraceMode.Enhanced;
DiagnosticLevel = SentryLevel.Debug;
TracesSampleRate = 1.0;
SampleRate = 1.0F;
MaxBreadcrumbs = 200;
CreateElementEventsBreadcrumbs = false;
IncludeBackgroundingStateInBreadcrumbs = false;
IncludeTextInBreadcrumbs = false;
IncludeTitleInBreadcrumbs = false;
SendDefaultPii = false;
AttachScreenshot = false;

Expected Result

App should not crash and continue reporting/logging to Sentry normally.

Actual Result

After 5 to 15 minutes of using the app it crashes at "random" moments when trying to report an issue to Sentry.

Stack Trace that I managed to get using GREF:

handle 0x15f36; key_handle 0x950a496: Java Type: `android/view/WindowManagerImpl`; MCW type: `Java.Lang.Object`
+g+ grefc 11514 gwrefc 0 obj-handle 0x6ff6179805/I -> new-handle 0x1460e/G from thread '.NET TP Worker'(3)
   at Android.Runtime.AndroidObjectReferenceManager.CreateGlobalReference(JniObjectReference value)
   at Java.Interop.JniObjectReference.NewGlobalRef()
   at Android.Runtime.JNIEnv.NewGlobalRef(IntPtr jobject)
   at Android.Runtime.AndroidValueManager.AddPeer(IJavaPeerable value, IntPtr handle, JniHandleOwnership transfer, IntPtr& handleField)
   at Java.Lang.Object.SetHandle(IntPtr value, JniHandleOwnership transfer)
   at Java.Lang.Object..ctor(IntPtr handle, JniHandleOwnership transfer)
   at Android.Net.NetworkInfo..ctor(IntPtr javaReference, JniHandleOwnership transfer)
   at System.Reflection.RuntimeConstructorInfo.InternalInvoke(RuntimeConstructorInfo , Object , IntPtr* , Exception& )
   at System.Reflection.MethodBaseInvoker.InterpretedInvoke_Constructor(Object obj, IntPtr* args)
   at System.Reflection.MethodBaseInvoker.InvokeDirectByRefWithFewArgs(Object obj, Span`1 copyOfArgs, BindingFlags invokeAttr)
   at System.Reflection.MethodBaseInvoker.InvokeWithFewArgs(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
   at System.Reflection.RuntimeConstructorInfo.Invoke(BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
   at System.Reflection.ConstructorInfo.Invoke(Object[] parameters)
   at Java.Interop.TypeManager.CreateProxy(Type type, IntPtr handle, JniHandleOwnership transfer)
   at Java.Interop.TypeManager.CreateInstance(IntPtr handle, JniHandleOwnership transfer, Type targetType)
   at Java.Lang.Object.GetObject(IntPtr handle, JniHandleOwnership transfer, Type type)
   at Java.Lang.Object._GetObject[NetworkInfo](IntPtr handle, JniHandleOwnership transfer)
   at Java.Lang.Object.GetObject[NetworkInfo](IntPtr handle, JniHandleOwnership transfer)
   at Android.Net.ConnectivityManager.GetNetworkInfo(Network network)
   at Microsoft.Maui.Networking.ConnectivityImplementation.get_NetworkAccess()
   at Microsoft.Maui.Networking.Connectivity.get_NetworkAccess()
   at Sentry.Maui.Internal.MauiDeviceData.ApplyMauiDeviceData(Device device, IDiagnosticLogger logger)
   at Sentry.Maui.Internal.SentryMauiEventProcessor.Process(SentryEvent event)
   at Sentry.Extensibility.ISentryEventProcessorExtensions.DoProcessEvent(ISentryEventProcessor processor, SentryEvent event, SentryHint hint)
   at Sentry.SentryClient.DoSendEvent(SentryEvent event, SentryHint hint, Scope scope)
   at Sentry.SentryClient.CaptureEvent(SentryEvent event, Scope scope, SentryHint hint)
   at Sentry.Internal.Hub.CaptureEvent(SentryEvent evt, Scope scope, SentryHint hint)
   at Sentry.Internal.Hub.CaptureEvent(SentryEvent evt, SentryHint hint, Action`1 configureScope)
   at Sentry.Internal.Hub.CaptureEvent(SentryEvent evt, Action`1 configureScope)
   at Sentry.HubExtensions.CaptureException(IHub hub, Exception ex, Action`1 configureScope)
   at Sentry.SentrySdk.CaptureException(Exception exception, Action`1 configureScope)
   at Prism.Plugin.Logging.Sentry.SentryLoggingService.Report(Exception ex, IDictionary`2 properties, String filePath, String memberName, Int32 lineNumber)
   at Prism.Plugin.Logging.AggregateLogger.<>c__DisplayClass12_0.<Report>b__0(ILogger l)
   at Prism.Plugin.Logging.AggregateLogger.InvokeOnLogger(Action`1 logAction)
   at Prism.Plugin.Logging.AggregateLogger.Report(Exception ex, IDictionary`2 properties, String filePath, String memberName, Int32 lineNumber)
(... Stack Trace continues with App code that originated the report that we want to log ...)

jamescrosswell commented 4 days ago

Hi @dinisvieira-xo ,

Thank you for the detailed investigation! I wasn't aware that you could get a stack trace from GREF logs 👍🏻

The stack trace you supplied seems to correspond to line 60 below: https://github.com/getsentry/sentry-dotnet/blob/f4d21b7e5a5835f45a4d02d33d2c902148388ff7/src/Sentry.Maui/Internal/MauiDeviceData.cs#L57-L65

However the stack trace also doesn't show any exception. The last two operations on the stack are:

handle 0x15f36; key_handle 0x950a496: Java Type: `android/view/WindowManagerImpl`; MCW type: `Java.Lang.Object`
+g+ grefc 11514 gwrefc 0 obj-handle 0x6ff6179805/I -> new-handle 0x1460e/G from thread '.NET TP Worker'(3)
   at Android.Runtime.AndroidObjectReferenceManager.CreateGlobalReference(JniObjectReference value)

So that's a call to Android.Runtime.AndroidObjectReferenceManager.CreateGlobalReference, resulting in an android/view/WindowManagerImpl being created at memory address 0x15f36 (if I'm reading that correctly).

I think whatever is crashing your application is coming from somewhere else then...

dinisvieira-xo commented 4 days ago

@jamescrosswell

If I understand this correctly, that stack won't show the crash exception; it just shows what that thread was doing when the crash occurred. To relate that thread/trace to the crash, I take the "invalid global reference" handle that the JNI crash itself reports in the "normal" Logcat log and search for the latest reference to that handle in the GREF output.

This is what the "normal" crash looks like in the Logcat output (in this example the handle would be 0xab156):

java_vm_ext.cc:591] JNI DETECTED ERROR IN APPLICATION: JNI ERROR (app bug): jobject is an invalid global reference: 0xab156 (deleted reference at index 21898)
java_vm_ext.cc:591]     in call to IsInstanceOf
runtime.cc:691] Runtime aborting...
runtime.cc:691] Dumping all threads without mutator lock held
runtime.cc:691] All threads:
runtime.cc:691] DALVIK THREADS (75):
runtime.cc:691] "main" prio=10 tid=1 Native
runtime.cc:691]   | group="" sCount=1 ucsCount=0 flags=1 obj=0x7375de58 self=0xb400006f47672400
runtime.cc:691]   | sysTid=7996 nice=-10 cgrp=top-app sched=0/0 handle=0x6ffdfa1500
runtime.cc:691]   | state=S schedstat=( 57074772681 4194296289 37276 ) utm=5176 stm=530 core=2 HZ=100
runtime.cc:691]   | stack=0x7fe8cb6000-0x7fe8cb8000 stackSize=8188KB
runtime.cc:691]   | held mutexes=
runtime.cc:691]   native: #00 pc 000edc78  /apex/com.android.runtime/lib64/bionic/libc.so (__epoll_pwait+8) (BuildId: 59cafcd8c6514064584f920f18ea8760)
runtime.cc:691]   native: #01 pc 00018a4c  /system/lib64/libutils.so (android::Looper::pollInner+188) (BuildId: cb91686656dc7d3bb5a8e813db3c244f)
runtime.cc:691]   native: #02 pc 0001892c  /system/lib64/libutils.so (android::Looper::pollOnce+124) (BuildId: cb91686656dc7d3bb5a8e813db3c244f)
(...)
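The handle lookup described above (take the handle from the Logcat crash line, then find the latest entry for it in the GREF output) can be sketched as follows. This is only an illustration: the class and method names are hypothetical, and it assumes the GREF log has already been pulled from the device and read into a list of lines.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of Sentry or .NET for Android.
public static class GrefLogSearch
{
    // Returns the index of the last line that mentions the handle, or -1 if none does.
    // The stack trace that follows that line shows where the reference was last touched.
    public static int FindLastMention(IReadOnlyList<string> lines, string handle)
    {
        for (int i = lines.Count - 1; i >= 0; i--)
        {
            if (ContainsHandle(lines[i], handle))
                return i;
        }
        return -1;
    }

    // Matches "0xab156" but not "0xab1567": the character right after
    // the handle must not be another hex digit.
    private static bool ContainsHandle(string line, string handle)
    {
        int start = 0;
        while (true)
        {
            int idx = line.IndexOf(handle, start, StringComparison.OrdinalIgnoreCase);
            if (idx < 0) return false;
            int end = idx + handle.Length;
            if (end >= line.Length || !Uri.IsHexDigit(line[end])) return true;
            start = idx + 1;
        }
    }
}
```

For the crash above, the call would be something like `GrefLogSearch.FindLastMention(grefLines, "0xab156")`, where `grefLines` is an assumed variable holding the log lines.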

That said, I have continued the investigation and found out that if I control the access to SentrySdk.CaptureException by using a Monitor/Lock I don't get the crash anymore. Example below:

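// Serialize logging so only one thread at a time reaches Sentry (and the MAUI device APIs).
bool lockTaken = false;
try
{
    Monitor.Enter(_monitorObj, ref lockTaken);
    _loggerService.Log(message, properties ?? new Dictionary<string, string>());
}
finally
{
    // Only release the lock if it was actually acquired.
    if (lockTaken)
    {
        Monitor.Exit(_monitorObj);
    }
}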

The "feeling" I'm getting from all of this is that the issue is somewhere in Microsoft.Maui's Device/Display information and Sentry just happens to be in the "middle". I did notice I'm calling Log/Crash reporting a lot more than I initially thought, in some cases with multiple calls at the same time, and perhaps MAUI doesn't "like that" when retrieving device information.

A big clue that led me to think this is that in past Xamarin Forms apps I also had issues when calling the Xamarin Forms Essentials device info APIs, and I also had to use a lock there to control access and avoid "random crashes".

So, I'm not sure there's anything Sentry can do (besides controlling access with a lock, of course). Perhaps we should just close the issue and keep it here as a "cautionary tale" for other developers who have similar problems?

jamescrosswell commented 3 days ago

> That said, I have continued the investigation and found out that if I control the access to SentrySdk.CaptureException by using a Monitor/Lock I don't get the crash anymore. Example below:

Maybe worth trying to recreate this. I think it'd be possible to spawn a bunch of threads, each running tight loops calling _loggerService.Log, without any locking/monitors or anything. That should cause the error to be thrown almost immediately... and if we can do that, we should be able to either find the root cause and/or a workaround.
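A minimal sketch of such a stress harness, under stated assumptions: `ConcurrentCaptureRepro` and `Hammer` are hypothetical names, the capture call is passed in as a delegate so the harness itself has no Sentry dependency, and the worker and iteration counts are arbitrary. Whether this actually triggers the JNI error on a device is untested.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical repro harness; not part of the Sentry SDK.
public static class ConcurrentCaptureRepro
{
    // Calls `capture` from `workers` threads in tight loops, with no locking,
    // and returns the total number of calls made.
    public static long Hammer(Action<Exception> capture, int workers, int iterationsPerWorker)
    {
        long calls = 0;
        using var start = new ManualResetEventSlim(false);
        var tasks = new Task[workers];
        for (int w = 0; w < workers; w++)
        {
            tasks[w] = Task.Factory.StartNew(() =>
            {
                start.Wait(); // release all workers together to maximize overlap
                for (int i = 0; i < iterationsPerWorker; i++)
                {
                    capture(new InvalidOperationException("repro"));
                    Interlocked.Increment(ref calls);
                }
            }, TaskCreationOptions.LongRunning);
        }
        start.Set();
        Task.WaitAll(tasks);
        return Interlocked.Read(ref calls);
    }
}
```

Inside a MAUI Android app with Sentry initialized, it could be invoked as `ConcurrentCaptureRepro.Hammer(ex => SentrySdk.CaptureException(ex), workers: 8, iterationsPerWorker: 1000);`. Passing the app's own `_loggerService.Log` call instead would exercise the same path as in the report.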