dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

Random app crashes - SIGABRT in SGen Worker #100311

Open pijnappel opened 5 months ago

pijnappel commented 5 months ago

Description

My app is processing a long running loop generating data and storing it into files. After several hours of running, the app closes and the logcat contains the crash report below:

03-19 12:59:58.679 2156 2175 E : How can an object and a reference inside it not be in the same block?
03-19 12:59:58.679 2156 2175 F libc : Fatal signal 6 (SIGABRT), code -6 (SI_TKILL) in tid 2175 (SGen worker), pid 2156 (yApp.mobile.xrf)

These random crashes (SIGABRT here, SIGSEGV in a different issue) prevent us from releasing the app to production. We're open to any suggestions for debugging and preventing them. We have now enabled additional Mono logging and will add those logs when they are available.

Steps to Reproduce

We are unable to reproduce this reliably. It happens at random.

Link to public reproduction project repository

No response

Version with bug

8.0.7 SR2

Is this a regression from previous behavior?

Not sure, did not test other versions

Last version that worked well

Unknown/Other

Affected platforms

Android

Affected platform versions

Android 9

Did you find any workaround?

No response

Relevant log output

03-19 12:59:58.679  2156  2175 E         : How can an object and a reference inside it not be in the same block?
03-19 12:59:58.679  2156  2175 F libc    : Fatal signal 6 (SIGABRT), code -6 (SI_TKILL) in tid 2175 (SGen worker), pid 2156 (yApp.mobile.xrf)
03-19 12:59:59.003 11233 11233 I crash_dump64: obtaining output fd from tombstoned, type: kDebuggerdTombstone
03-19 12:59:59.007  3029  3029 I /system/bin/tombstoned: received crash request for pid 2175
03-19 12:59:59.009 11233 11233 I crash_dump64: performing dump of process 2156 (target tid = 2175)
03-19 12:59:59.030 11233 11233 F DEBUG   : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
03-19 12:59:59.031 11233 11233 F DEBUG   : Build fingerprint: 'Android/aek/aek:9/2.3.4-ga-rc2/root01172037:userdebug/release-keys'
03-19 12:59:59.063 11233 11233 I crash_dump64: type=1400 audit(0.0:5140): avc: denied { read } for name="com.google.android.datatransport.events-shm" dev="mmcblk0p13" ino=917935 scontext=u:r:crash_dump:s0:c110,c256,c512,c768 tcontext=u:object_r:app_data_file:s0:c110,c256,c512,c768 tclass=file permissive=1
03-19 12:59:59.031 11233 11233 F DEBUG   : Revision: '0'
03-19 12:59:59.031 11233 11233 F DEBUG   : ABI: 'arm64'
03-19 12:59:59.031 11233 11233 F DEBUG   : pid: 2156, tid: 2175, name: SGen worker  >>> com.myApp.mobile.xrf <<<
03-19 12:59:59.031 11233 11233 F DEBUG   : signal 6 (SIGABRT), code -6 (SI_TKILL), fault addr --------
03-19 12:59:59.031 11233 11233 F DEBUG   :     x0  0000000000000000  x1  000000000000087f  x2  0000000000000006  x3  0000000000000008
03-19 12:59:59.031 11233 11233 F DEBUG   :     x4  8000000000000000  x5  8000000000000000  x6  8000000000000000  x7  0000000000000080
03-19 12:59:59.031 11233 11233 F DEBUG   :     x8  0000000000000083  x9  0000e9bc6f0f89a0  x10 fffffff87ffffbdf  x11 0000000000000001
03-19 12:59:59.031 11233 11233 F DEBUG   :     x12 0000e9bc62cdd730  x13 0000000000000038  x14 ffffffffffffffff  x15 000038b594000000
03-19 12:59:59.031 11233 11233 F DEBUG   :     x16 0000e9bc6f1312c8  x17 0000e9bc6f06f2d8  x18 0000e9bc3ca8e54a  x19 000000000000086c
03-19 12:59:59.031 11233 11233 F DEBUG   :     x20 000000000000087f  x21 0000000000000083  x22 0000e9bc3c7fdf98  x23 0000000000000010
03-19 12:59:59.031 11233 11233 F DEBUG   :     x24 0000e9bc4f0ae1d0  x25 ffffffffffffffff  x26 0000e9bc4f0ae180  x27 0000e9bc4f0ad000
03-19 12:59:59.031 11233 11233 F DEBUG   :     x28 0000e9bc4f0ad000  x29 0000e9bc3ca8ec20
03-19 12:59:59.031 11233 11233 F DEBUG   :     sp  0000e9bc3ca8ebe0  lr  0000e9bc6f063a90  pc  0000e9bc6f063abc
03-19 12:59:59.035 11233 11233 F DEBUG   : 
03-19 12:59:59.035 11233 11233 F DEBUG   : backtrace:
03-19 12:59:59.035 11233 11233 F DEBUG   :     #00 pc 0000000000021abc  /system/lib64/libc.so (abort+124)
03-19 12:59:59.035 11233 11233 F DEBUG   :     #01 pc 000000000001f360  /data/app/com.myApp.mobile.xrf-agVQeWGC0HQYuIfiW60gGg==/split_config.arm64_v8a.apk (offset 0xd38000) (xamarin::android::Helpers::abort_application()+8)
03-19 12:59:59.035 11233 11233 F DEBUG   :     #02 pc 0000000000035660  /data/app/com.myApp.mobile.xrf-agVQeWGC0HQYuIfiW60gGg==/split_config.arm64_v8a.apk (offset 0xd38000) (xamarin::android::internal::MonodroidRuntime::mono_log_handler(char const*, char const*, char const*, int, void*)+144)
03-19 12:59:59.035 11233 11233 F DEBUG   :     #03 pc 00000000001d71e4  /data/app/com.myApp.mobile.xrf-agVQeWGC0HQYuIfiW60gGg==/split_config.arm64_v8a.apk (offset 0xd8a000)
03-19 12:59:59.035 11233 11233 F DEBUG   :     #04 pc 00000000001d726c  /data/app/com.myApp.mobile.xrf-agVQeWGC0HQYuIfiW60gGg==/split_config.arm64_v8a.apk (offset 0xd8a000)
03-19 12:59:59.035 11233 11233 F DEBUG   :     #05 pc 00000000002ec704  /data/app/com.myApp.mobile.xrf-agVQeWGC0HQYuIfiW60gGg==/split_config.arm64_v8a.apk (offset 0xd8a000)
03-19 12:59:59.035 11233 11233 F DEBUG   :     #06 pc 00000000002dca54  /data/app/com.myApp.mobile.xrf-agVQeWGC0HQYuIfiW60gGg==/split_config.arm64_v8a.apk (offset 0xd8a000)
03-19 12:59:59.035 11233 11233 F DEBUG   :     #07 pc 00000000002a70d0  /data/app/com.myApp.mobile.xrf-agVQeWGC0HQYuIfiW60gGg==/split_config.arm64_v8a.apk (offset 0xd8a000)
03-19 12:59:59.035 11233 11233 F DEBUG   :     #08 pc 00000000002bf9b4  /data/app/com.myApp.mobile.xrf-agVQeWGC0HQYuIfiW60gGg==/split_config.arm64_v8a.apk (offset 0xd8a000)
03-19 12:59:59.035 11233 11233 F DEBUG   :     #09 pc 00000000002d2984  /data/app/com.myApp.mobile.xrf-agVQeWGC0HQYuIfiW60gGg==/split_config.arm64_v8a.apk (offset 0xd8a000)
03-19 12:59:59.035 11233 11233 F DEBUG   :     #10 pc 00000000002c85fc  /data/app/com.myApp.mobile.xrf-agVQeWGC0HQYuIfiW60gGg==/split_config.arm64_v8a.apk (offset 0xd8a000)
03-19 12:59:59.035 11233 11233 F DEBUG   :     #11 pc 00000000002fafc4  /data/app/com.myApp.mobile.xrf-agVQeWGC0HQYuIfiW60gGg==/split_config.arm64_v8a.apk (offset 0xd8a000)
03-19 12:59:59.035 11233 11233 F DEBUG   :     #12 pc 0000000000083114  /system/lib64/libc.so (__pthread_start(void*)+36)
03-19 12:59:59.035 11233 11233 F DEBUG   :     #13 pc 00000000000233bc  /system/lib64/libc.so (__start_thread+68)

PureWeen commented 5 months ago

@jonathanpeppers thoughts?

pijnappel commented 5 months ago

Any suggestions on how to track this down? I have no idea where to start looking.

jonathanpeppers commented 5 months ago

Can you reproduce the crash with extra logging turned on?

adb shell setprop debug.mono.log default,assembly,mono_log_level=debug,mono_log_mask=all

That might give more information.

As the message mentions the Mono GC (SGen worker), we can move this to dotnet/runtime for the right folks to look at this.
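
Once a logcat has been captured with the extra logging enabled, it can be large. The following is a minimal Python sketch, not an official tool, for locating the "Fatal signal" line in a saved log and pulling out everything the crashing thread wrote. It assumes the default threadtime logcat layout seen in this issue (date, time, pid, tid, level, tag); the function names are hypothetical.

```python
import re

# Matches the libc "Fatal signal" line as it appears in this issue's logcat.
FATAL_RE = re.compile(
    r"Fatal signal (?P<signo>\d+) \((?P<signame>\w+)\), "
    r"code (?P<code>-?\d+) \((?P<codename>\w+)\)"
    r".*? in tid (?P<tid>\d+) \((?P<thread>[^)]*)\), "
    r"pid (?P<pid>\d+) \((?P<process>[^)]*)\)"
)


def find_fatal_signals(log_text):
    """Return one dict per 'Fatal signal' line found in the logcat text."""
    crashes = []
    for line in log_text.splitlines():
        match = FATAL_RE.search(line)
        if match:
            info = match.groupdict()
            for key in ("signo", "code", "tid", "pid"):
                info[key] = int(info[key])
            crashes.append(info)
    return crashes


def lines_for_thread(log_text, pid, tid):
    """Keep only lines whose pid and tid columns match (threadtime layout assumed)."""
    kept = []
    for line in log_text.splitlines():
        parts = line.split(None, 4)  # date, time, pid, tid, rest of line
        if len(parts) >= 4 and parts[2] == str(pid) and parts[3] == str(tid):
            kept.append(line)
    return kept


# Three lines copied from the log in this issue.
SAMPLE = """\
03-19 12:59:58.679  2156  2175 E         : How can an object and a reference inside it not be in the same block?
03-19 12:59:58.679  2156  2175 F libc    : Fatal signal 6 (SIGABRT), code -6 (SI_TKILL) in tid 2175 (SGen worker), pid 2156 (yApp.mobile.xrf)
03-19 12:59:59.003 11233 11233 I crash_dump64: obtaining output fd from tombstoned, type: kDebuggerdTombstone
"""

if __name__ == "__main__":
    for crash in find_fatal_signals(SAMPLE):
        print(crash["signame"], "in thread", crash["thread"])
        for line in lines_for_thread(SAMPLE, crash["pid"], crash["tid"]):
            print("  ", line)
```

Filtering by pid and tid keeps the message the SGen worker logged immediately before aborting next to the signal line, which is usually the part worth quoting in a report.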

pijnappel commented 5 months ago

I was able to reproduce the crash with extended Mono logging. Here is the full logcat:

SIGABRT.log

am11 commented 5 months ago

The stack trace looks similar to the one in https://github.com/dotnet/runtime/issues/96804. The fix for that issue is in the latest 8.0.3 runtime release (I'm not sure how that corresponds to the 8.0.7 SR2 version in the top comment). cc @lambdageek

lambdageek commented 5 months ago

@am11 this is not related to #96804. that had an assertion failure. this one doesn't (at least I can't find one in SIGABRT.log).

all android crash logs look basically like this without symbols.

this isn't really actionable without symbolication @jonathanpeppers

jonathanpeppers commented 5 months ago

@pijnappel do you know what version of the runtime you are using? 8.0.2? We could probably use dotnet-symbol and ndk-stack to get line numbers.
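
Before running any symbolication tool, it helps to know which frames actually lack symbols. The following is a rough Python sketch that parses the backtrace lines from the tombstone and lists each frame's pc, file, and APK offset. The regular expression is an assumption based only on the frame layout visible in this log, and the function names are hypothetical.

```python
import re

# Frame layout as seen in this issue:
#   #NN pc <hex pc>  <path> [(offset 0x...)] [(symbol+offset)]
# search() is used, so anything that precedes '#NN' on the line is ignored.
FRAME_RE = re.compile(
    r"#(?P<index>\d+)\s+pc\s+(?P<pc>[0-9a-f]+)\s+(?P<path>\S+)"
    r"(?:\s+\(offset (?P<offset>0x[0-9a-f]+)\))?"
    r"(?:\s+\((?P<symbol>.+)\))?\s*$"
)


def parse_frames(log_text):
    """Return a list of dicts, one per backtrace frame found in the text."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match is None:
            continue
        offset = match.group("offset")
        frames.append({
            "index": int(match.group("index")),
            "pc": int(match.group("pc"), 16),
            "path": match.group("path"),
            "offset": int(offset, 16) if offset else None,
            "symbol": match.group("symbol"),  # None when the frame has no symbol
        })
    return frames


def unsymbolicated(frames):
    """Frames for which the crash dumper could not print a symbol name."""
    return [frame for frame in frames if frame["symbol"] is None]


# Four frames copied from the backtrace in this issue.
SAMPLE = """\
03-19 12:59:59.035 11233 11233 F DEBUG   :     #00 pc 0000000000021abc  /system/lib64/libc.so (abort+124)
03-19 12:59:59.035 11233 11233 F DEBUG   :     #01 pc 000000000001f360  /data/app/com.myApp.mobile.xrf-agVQeWGC0HQYuIfiW60gGg==/split_config.arm64_v8a.apk (offset 0xd38000) (xamarin::android::Helpers::abort_application()+8)
03-19 12:59:59.035 11233 11233 F DEBUG   :     #03 pc 00000000001d71e4  /data/app/com.myApp.mobile.xrf-agVQeWGC0HQYuIfiW60gGg==/split_config.arm64_v8a.apk (offset 0xd8a000)
03-19 12:59:59.035 11233 11233 F DEBUG   :     #12 pc 0000000000083114  /system/lib64/libc.so (__pthread_start(void*)+36)
"""

if __name__ == "__main__":
    for frame in unsymbolicated(parse_frames(SAMPLE)):
        print(f"frame {frame['index']}: pc=0x{frame['pc']:x} "
              f"apk offset=0x{frame['offset']:x}")
```

In the full backtrace, the frames without symbols all share the same APK offset (0xd8a000), so they come from a single native library packed inside the APK. That is the library whose symbols are needed to get line numbers.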

pijnappel commented 4 months ago

I'm not really sure. I'm usually up to date and install all updates. How do I get the runtime version used by my app? I cannot reproduce the error when compiled in Debug. Is there any doc I can follow to get symbols and ndk-stack?

am11 commented 4 months ago

Something like dotnet tool install --global dotnet-symbol, then download symbols for all installed libs: sudo dotnet symbol --recurse-subdirectories --timeout 20 --symbols '/usr/share/dotnet/*.so', then download them for your app: dotnet symbol --timeout 20 --symbols path/to/your-app-binary (which may or may not fail depending on the form factor). Finally, run the app under a debugger (like lldb or gdb).