streetcomplete / StreetComplete

Easy to use OpenStreetMap editor for Android
https://streetcomplete.app
GNU General Public License v3.0

Random crash on Android 12 (hardened_malloc on GrapheneOS is crashing with "detected write after free") #4277

Closed Altonss closed 2 years ago

Altonss commented 2 years ago

When navigating the map, the app crashes: it happened while on a motorway and also while navigating the map "far" from my current position (a few kilometers).

How to Reproduce Currently I don't know how to reproduce.

Expected Behavior Should not crash.

Versions affected Android 12 StreetComplete 45.1 from F-Droid

Altonss commented 2 years ago

I will wait for 45.2 to reach F-Droid and give you my feedback if it resolves the issue :)

westnordost commented 2 years ago

Did you send a crash report when the app asked you to?

Altonss commented 2 years ago

Did you send a crash report when the app asked you to?

The app did not ask me to, I just restarted the app and nothing showed up! I also just tested 45.2, but the crash still happens, and this time it's really regular :thinking:

westnordost commented 2 years ago

I can't think of any situation where the app would not ask you to send a crash report after it crashed if you got the app from F-Droid. It does not ask if you got the app from Google Play or if it is a development build. Maybe the app is "regularly" killed by the system (For whatever reason), so it does not count as a crash?

westnordost commented 2 years ago

Do you have an email app on your phone?

Altonss commented 2 years ago

Do you have an email app on your phone?

Yes I have K9-Mail on the phone

Altonss commented 2 years ago

Maybe the app is "regularly" killed by the system (For whatever reason), so it does not count as a crash?

The battery setting for this app was set to "Optimised", I just set it to "Without restrictions" and will try to see if it resolves the crash issue

Altonss commented 2 years ago

The battery setting for this app was set to "Optimised", I just set it to "Without restrictions" and will try to see if it resolves the crash issue

Still crashing, so it was not the issue here :)

westnordost commented 2 years ago

So 🤷 I don't know. Apparently, StreetComplete does not get notified that there is a crash, which is why it does not ask to send a crash report. It sounds like the crash happens outside of the app, maybe?

Do you have access to your system log? (adb logcat) Maybe the system log will reveal what is going wrong.

Altonss commented 2 years ago

Do you have access to your system log? (adb logcat) Maybe the system log will reveal what is going wrong.

I will try to do adb logcat as soon as possible :)

Altonss commented 2 years ago

I managed to record the crash with adb logcat :) Which part is important for you? (It's pretty long, and I haven't looked at whether there is any personal data in it.)

westnordost commented 2 years ago

The crash itself should not be too long. The whole stack trace would be nice, but what happened immediately before might be interesting too.

Altonss commented 2 years ago
08-08 16:27:31.778 22772 22805 F hardened_malloc: fatal allocator error: detected write after free
08-08 16:27:31.778 22772 22805 F libc    : Fatal signal 6 (SIGABRT), code -1 (SI_QUEUE) in tid 22805 (DefaultDispatch), pid 22772 (.streetcomplete)

This is maybe related to the hardened_malloc in GrapheneOS...

matkoniecz commented 2 years ago

haven't looked at it if there is any personal data.

You can send to email at https://github.com/streetcomplete/StreetComplete/blob/e34f3b5163d4c443c6436fddb531c1116c0529a7/app/src/main/java/de/westnordost/streetcomplete/ApplicationModule.kt#L17 if you prefer to not make it public

I think that the most private info is location, anyway revealed by your mapping

Altonss commented 2 years ago

haven't looked at it if there is any personal data.

You can send to email at

https://github.com/streetcomplete/StreetComplete/blob/e34f3b5163d4c443c6436fddb531c1116c0529a7/app/src/main/java/de/westnordost/streetcomplete/ApplicationModule.kt#L17 if you prefer to not make it public

I think that the most private info is location, anyway revealed by your mapping

From what I understand, the 2 lines I sent are a good start. It looks to be a security feature implemented in hardened_malloc that guards against memory allocation errors...

westnordost commented 2 years ago

But it is not a stack trace. Is there no stack trace? That error just means that somewhere in native code, a process wasn't able to assign memory.

Altonss commented 2 years ago

But it is not a stack trace. Is there no stack trace? That error just means that somewhere in native code, a process wasn't able to assign memory.

08-08 16:27:32.440 22878 22878 F DEBUG   : pid: 22772, tid: 22805, name: DefaultDispatch  >>> de.westnordost.streetcomplete <<<
08-08 16:27:32.440 22878 22878 F DEBUG   : uid: 10135
08-08 16:27:32.440 22878 22878 F DEBUG   : signal 6 (SIGABRT), code -1 (SI_QUEUE), fault addr --------
08-08 16:27:32.440 22878 22878 F DEBUG   :     x0  0000000000000000  x1  0000000000005915  x2  0000000000000006  x3  0000ca63833baa20
08-08 16:27:32.440 22878 22878 F DEBUG   :     x4  1f63647362647364  x5  1f63647362647364  x6  1f63647362647364  x7  7f7f7f7f7f7f7f7f
08-08 16:27:32.440 22878 22878 F DEBUG   :     x8  00000000000000f0  x9  0000ca96b3125a10  x10 0000000000000000  x11 ffffff80fffffbdf
08-08 16:27:32.440 22878 22878 F DEBUG   :     x12 0000000000000001  x13 0000000100000000  x14 0000ca63833ba630  x15 0000000000000000
08-08 16:27:32.440 22878 22878 F DEBUG   :     x16 0000ca96b31c0158  x17 0000ca96b319d470  x18 0000ca6382728000  x19 00000000000058f4
08-08 16:27:32.440 22878 22878 F DEBUG   :     x20 0000000000005915  x21 00000000ffffffff  x22 0000ca6adcf765b0  x23 0000ca9534dd3200
08-08 16:27:32.440 22878 22878 F DEBUG   :     x24 b400ca67ebe19380  x25 0000000000000070  x26 0000000000000007  x27 0000ca951f6a0000
08-08 16:27:32.440 22878 22878 F DEBUG   :     x28 0000ca951f7a1550  x29 0000ca63833baaa0
08-08 16:27:32.440 22878 22878 F DEBUG   :     lr  0000ca96b3150288  sp  0000ca63833baa00  pc  0000ca96b31502b8  pst 0000000000000000
08-08 16:27:32.440 22878 22878 F DEBUG   : backtrace:
08-08 16:27:32.440 22878 22878 F DEBUG   :       #00 pc 000000000004c2b8  /apex/com.android.runtime/lib64/bionic/libc.so (abort+168) (BuildId: 761d634420410980165d18a838ce8c70)
08-08 16:27:32.440 22878 22878 F DEBUG   :       #01 pc 0000000000043394  /apex/com.android.runtime/lib64/bionic/libc.so (fatal_error+112) (BuildId: 761d634420410980165d18a838ce8c70)
08-08 16:27:32.440 22878 22878 F DEBUG   :       #02 pc 00000000000405bc  /apex/com.android.runtime/lib64/bionic/libc.so (allocate+2116) (BuildId: 761d634420410980165d18a838ce8c70)
08-08 16:27:32.440 22878 22878 F DEBUG   :       #03 pc 000000000003bc30  /apex/com.android.runtime/lib64/bionic/libc.so (malloc+36) (BuildId: 761d634420410980165d18a838ce8c70)
08-08 16:27:32.440 22878 22878 F DEBUG   :       #04 pc 000000000004cb8c  /system/lib64/libc++.so (operator new(unsigned long)+24) (BuildId: 8a3045534e859293b5fb63562d4811eb)
08-08 16:27:32.440 22878 22878 F DEBUG   :       #05 pc 00000000000979a8  /system/lib64/libc++.so (std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::__grow_by(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long)+152) (BuildId: 8a3045534e859293b5fb63562d4811eb)
08-08 16:27:32.440 22878 22878 F DEBUG   :       #06 pc 0000000000097a88  /system/lib64/libc++.so (std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::push_back(char)+100) (BuildId: 8a3045534e859293b5fb63562d4811eb)
08-08 16:27:32.440 22878 22878 F DEBUG   :       #07 pc 000000000028c818  /apex/com.android.art/lib64/libart.so (std::__1::basic_stringbuf<char, std::__1::char_traits<char>, std::__1::allocator<char> >::overflow(int)+108) (BuildId: 82b8f687190ca282ccd47876ca6c57b9)
08-08 16:27:32.440 22878 22878 F DEBUG   :       #08 pc 000000000005ae0c  /system/lib64/libc++.so (std::__1::basic_streambuf<char, std::__1::char_traits<char> >::xsputn(char const*, long)+140) (BuildId: 8a3045534e859293b5fb63562d4811eb)
08-08 16:27:32.440 22878 22878 F DEBUG   :       #09 pc 000000000027c4d8  /apex/com.android.art/lib64/libart.so (std::__1::ostreambuf_iterator<char, std::__1::char_traits<char> > std::__1::__pad_and_output<char, std::__1::char_traits<char> >(std::__1::ostreambuf_iterator<char, std::__1::char_traits<char> >, char const*, char const*, char const*, std::__1::ios_base&, char)+328) (BuildId: 82b8f687190ca282ccd47876ca6c57b9)
08-08 16:27:32.440 22878 22878 F DEBUG   :       #10 pc 000000000027c330  /apex/com.android.art/lib64/libart.so (std::__1::basic_ostream<char, std::__1::char_traits<char> >& std::__1::__put_character_sequence<char, std::__1::char_traits<char> >(std::__1::basic_ostream<char, std::__1::char_traits<char> >&, char const*, unsigned long)+212) (BuildId: 82b8f687190ca282ccd47876ca6c57b9)
08-08 16:27:32.441 22878 22878 F DEBUG   :       #11 pc 00000000002e98d8  /apex/com.android.art/lib64/libart.so (art::AddReferrerLocation(std::__1::basic_ostream<char, std::__1::char_traits<char> >&, art::ObjPtr<art::mirror::Class>)+108) (BuildId: 82b8f687190ca282ccd47876ca6c57b9)
08-08 16:27:32.441 22878 22878 F DEBUG   :       #12 pc 00000000002e4a38  /apex/com.android.art/lib64/libart.so (art::ThrowException(char const*, art::ObjPtr<art::mirror::Class>, char const*, std::__va_list*) (.llvm.16620363683444992276)+388) (BuildId: 82b8f687190ca282ccd47876ca6c57b9)
08-08 16:27:32.441 22878 22878 F DEBUG   :       #13 pc 00000000002e785c  /apex/com.android.art/lib64/libart.so (art::ThrowNoSuchFieldException(art::ObjPtr<art::mirror::Class>, std::__1::basic_string_view<char, std::__1::char_traits<char> >)+340) (BuildId: 82b8f687190ca282ccd47876ca6c57b9)
08-08 16:27:32.441 22878 22878 F DEBUG   :       #14 pc 0000000000570188  /apex/com.android.art/lib64/libart.so (art::Class_getDeclaredField(_JNIEnv*, _jobject*, _jstring*)+2124) (BuildId: 82b8f687190ca282ccd47876ca6c57b9)
08-08 16:27:32.441 22878 22878 F DEBUG   :       #15 pc 00000000000b1810  /apex/com.android.art/javalib/arm64/boot.oat (art_jni_trampoline+112) (BuildId: 165623f6d01b95c1fa9ab0d0ac3f13f5e28dc768)
08-08 16:27:32.441 22878 22878 F DEBUG   :       #16 pc 00000000008b0cf8  /data/app/~~lkH7g7HFDf_wipQGQqv1sg==/de.westnordost.streetcomplete-E8xE1EGkgLXi_h-EThCPkw==/oat/arm64/base.odex (kotlinx.serialization.internal.PlatformKt.companionOrNull+72)
08-08 16:27:32.441 22878 22878 F DEBUG   :       #17 pc 00000000008b2cc8  /data/app/~~lkH7g7HFDf_wipQGQqv1sg==/de.westnordost.streetcomplete-E8xE1EGkgLXi_h-EThCPkw==/oat/arm64/base.odex (kotlinx.serialization.internal.PlatformKt.invokeSerializerOnCompanion+56)
08-08 16:27:32.441 22878 22878 F DEBUG   :       #18 pc 00000000008b11c8  /data/app/~~lkH7g7HFDf_wipQGQqv1sg==/de.westnordost.streetcomplete-E8xE1EGkgLXi_h-EThCPkw==/oat/arm64/base.odex (kotlinx.serialization.internal.PlatformKt.constructSerializerForGivenTypeArgs+440)
08-08 16:27:32.441 22878 22878 F DEBUG   :       #19 pc 00000000008aa488  /data/app/~~lkH7g7HFDf_wipQGQqv1sg==/de.westnordost.streetcomplete-E8xE1EGkgLXi_h-EThCPkw==/oat/arm64/base.odex (kotlinx.serialization.SerializersKt__SerializersKt.serializerByKTypeImpl$SerializersKt__SerializersKt+1304)
08-08 16:27:32.441 22878 22878 F DEBUG   :       #20 pc 00000000008a75c8  /data/app/~~lkH7g7HFDf_wipQGQqv1sg==/de.westnordost.streetcomplete-E8xE1EGkgLXi_h-EThCPkw==/oat/arm64/base.odex (kotlinx.serialization.SerializersKt__SerializersKt.builtinSerializer$SerializersKt__SerializersKt+1128)
08-08 16:27:32.441 22878 22878 F DEBUG   :       #21 pc 00000000008aa2a0  /data/app/~~lkH7g7HFDf_wipQGQqv1sg==/de.westnordost.streetcomplete-E8xE1EGkgLXi_h-EThCPkw==/oat/arm64/base.odex (kotlinx.serialization.SerializersKt__SerializersKt.serializerByKTypeImpl$SerializersKt__SerializersKt+816)
08-08 16:27:32.441 22878 22878 F DEBUG   :       #22 pc 0000000000712b74  /data/app/~~lkH7g7HFDf_wipQGQqv1sg==/de.westnordost.streetcomplete-E8xE1EGkgLXi_h-EThCPkw==/oat/arm64/base.odex (de.westnordost.streetcomplete.data.osm.mapdata.NodeDaoKt.toNode+1332)
08-08 16:27:32.441 22878 22878 F DEBUG   :       #23 pc 0000000000ed09b8  /data/app/~~lkH7g7HFDf_wipQGQqv1sg==/de.westnordost.streetcomplete-E8xE1EGkgLXi_h-EThCPkw==/oat/arm64/base.odex (de.westnordost.streetcomplete.data.osm.mapdata.NodeDao$getAll$2.invoke+216)
08-08 16:27:32.441 22878 22878 F DEBUG   :       #24 pc 0000000000ba56e0  /data/app/~~lkH7g7HFDf_wipQGQqv1sg==/de.westnordost.streetcomplete-E8xE1EGkgLXi_h-EThCPkw==/oat/arm64/base.odex (de.westnordost.streetcomplete.data.AndroidDatabase.query+912)
08-08 16:27:32.441 22878 22878 F DEBUG   :       #25 pc 000000000069ea58  /data/app/~~lkH7g7HFDf_wipQGQqv1sg==/de.westnordost.streetcomplete-E8xE1EGkgLXi_h-EThCPkw==/oat/arm64/base.odex (de.westnordost.streetcomplete.data.Database$DefaultImpls.query$default+264)
08-08 16:27:32.441 22878 22878 F DEBUG   :       #26 pc 00000000006fb4bc  /data/app/~~lkH7g7HFDf_wipQGQqv1sg==/de.westnordost.streetcomplete-E8xE1EGkgLXi_h-EThCPkw==/oat/arm64/base.odex (de.westnordost.streetcomplete.data.osm.mapdata.ElementDao.getAll+380)
08-08 16:27:32.441 22878 22878 F DEBUG   :       #27 pc 00000000007085b8  /data/app/~~lkH7g7HFDf_wipQGQqv1sg==/de.westnordost.streetcomplete-E8xE1EGkgLXi_h-EThCPkw==/oat/arm64/base.odex (de.westnordost.streetcomplete.data.osm.mapdata.MapDataController.getMapDataWithGeometry+200)
08-08 16:27:32.441 22878 22878 F DEBUG   :       #28 pc 0000000000bc1144  /data/app/~~lkH7g7HFDf_wipQGQqv1sg==/de.westnordost.streetcomplete-E8xE1EGkgLXi_h-EThCPkw==/oat/arm64/base.odex (de.westnordost.streetcomplete.data.osm.edits.MapDataWithEditsSource.getMapDataWithGeometry+180)
08-08 16:27:32.441 22878 22878 F DEBUG   :       #29 pc 00000000011e349c  /data/app/~~lkH7g7HFDf_wipQGQqv1sg==/de.westnordost.streetcomplete-E8xE1EGkgLXi_h-EThCPkw==/oat/arm64/base.odex (de.westnordost.streetcomplete.screens.main.map.StyleableOverlayManager$onNewTilesRect$1$mapData$1.invokeSuspend+284)
08-08 16:27:32.441 22878 22878 F DEBUG   :       #30 pc 0000000000cc1ae4  /data/app/~~lkH7g7HFDf_wipQGQqv1sg==/de.westnordost.streetcomplete-E8xE1EGkgLXi_h-EThCPkw==/oat/arm64/base.odex (kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith+292)
08-08 16:27:32.441 22878 22878 F DEBUG   :       #31 pc 0000000000cca7d0  /data/app/~~lkH7g7HFDf_wipQGQqv1sg==/de.westnordost.streetcomplete-E8xE1EGkgLXi_h-EThCPkw==/oat/arm64/base.odex (kotlinx.coroutines.DispatchedTask.run+1824)
08-08 16:27:32.441 22878 22878 F DEBUG   :       #32 pc 00000000012426dc  /data/app/~~lkH7g7HFDf_wipQGQqv1sg==/de.westnordost.streetcomplete-E8xE1EGkgLXi_h-EThCPkw==/oat/arm64/base.odex (kotlinx.coroutines.internal.LimitedDispatcher.run+764)
08-08 16:27:32.441 22878 22878 F DEBUG   :       #33 pc 0000000000cd5e8c  /data/app/~~lkH7g7HFDf_wipQGQqv1sg==/de.westnordost.streetcomplete-E8xE1EGkgLXi_h-EThCPkw==/oat/arm64/base.odex (kotlinx.coroutines.scheduling.TaskImpl.run+76)
08-08 16:27:32.441 22878 22878 F DEBUG   :       #34 pc 00000000008a1984  /data/app/~~lkH7g7HFDf_wipQGQqv1sg==/de.westnordost.streetcomplete-E8xE1EGkgLXi_h-EThCPkw==/oat/arm64/base.odex (kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely+68)
08-08 16:27:32.441 22878 22878 F DEBUG   :       #35 pc 000000000089c8a8  /data/app/~~lkH7g7HFDf_wipQGQqv1sg==/de.westnordost.streetcomplete-E8xE1EGkgLXi_h-EThCPkw==/oat/arm64/base.odex (kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker+1128)
08-08 16:27:32.441 22878 22878 F DEBUG   :       #36 pc 000000000089e0e8  /data/app/~~lkH7g7HFDf_wipQGQqv1sg==/de.westnordost.streetcomplete-E8xE1EGkgLXi_h-EThCPkw==/oat/arm64/base.odex (kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run+40)
08-08 16:27:32.441 22878 22878 F DEBUG   :       #37 pc 0000000000218964  /apex/com.android.art/lib64/libart.so (art_quick_invoke_stub+548) (BuildId: 82b8f687190ca282ccd47876ca6c57b9)
08-08 16:27:32.441 22878 22878 F DEBUG   :       #38 pc 0000000000284080  /apex/com.android.art/lib64/libart.so (art::ArtMethod::Invoke(art::Thread*, unsigned int*, unsigned int, art::JValue*, char const*)+184) (BuildId: 82b8f687190ca282ccd47876ca6c57b9)
08-08 16:27:32.441 22878 22878 F DEBUG   :       #39 pc 000000000061776c  /apex/com.android.art/lib64/libart.so (art::JValue art::InvokeVirtualOrInterfaceWithJValues<art::ArtMethod*>(art::ScopedObjectAccessAlreadyRunnable const&, _jobject*, art::ArtMethod*, jvalue const*)+460) (BuildId: 82b8f687190ca282ccd47876ca6c57b9)
08-08 16:27:32.441 22878 22878 F DEBUG   :       #40 pc 0000000000665c88  /apex/com.android.art/lib64/libart.so (art::Thread::CreateCallback(void*)+1164) (BuildId: 82b8f687190ca282ccd47876ca6c57b9)
08-08 16:27:32.441 22878 22878 F DEBUG   :       #41 pc 00000000000ae080  /apex/com.android.runtime/lib64/bionic/libc.so (__pthread_start(void*)+204) (BuildId: 761d634420410980165d18a838ce8c70)
08-08 16:27:32.441 22878 22878 F DEBUG   :       #42 pc 000000000004da70  /apex/com.android.runtime/lib64/bionic/libc.so (__start_thread+64) (BuildId: 761d634420410980165d18a838ce8c70)

Is this what you are looking for?

westnordost commented 2 years ago

Yes, thank you! Looks on first glance like an issue outside of StreetComplete though. The last line of code executed is

https://github.com/streetcomplete/StreetComplete/blob/c2def4404b72c6edb855dd9fb7a0596d69674056/app/src/main/java/de/westnordost/streetcomplete/data/osm/mapdata/NodeDao.kt#L97

Or written in another way, more clearly:

Json.decodeFromString<Map<String, String>>(aString)

Then, it is off to the kotlinx-serialization library (that converts JSON strings to Java objects). I doubt though that the issue is in kotlinx-serialization. kotlinx-serialization does not have native code (for the code that runs on a JVM). Code that runs in the JVM cannot trigger any error in memory allocation.

To me, it looks like the error is in ART itself, which is a component of your Android system (The Android system's Java Virtual Machine on which all apps are executed). This would also explain why StreetComplete does not notice that it crashed. The container in which StreetComplete was executed crashed.

I suggest reporting this to the GrapheneOS developers, or maybe first checking whether an update is available or the problem is already known.

Altonss commented 2 years ago

Thanks a lot for your explanation! But this is strange as it only happens on StreetComplete... :thinking:

I suggest reporting this to the GrapheneOS developers, or maybe first checking whether an update is available or the problem is already known.

I already have the latest version installed, so I might open an issue there. So the affected component is probably the ART of GrapheneOS?

matkoniecz commented 2 years ago

But this is strange as it only happens on StreetComplete...

Maybe it is more noticeable with SC, and some other people complain about notifications sometimes not working or about not getting notified about emails? Or something else just crashes and you have not investigated it.

Also, SC serializes/de-serializes more and more complex data than many apps.

Altonss commented 2 years ago

So the bug seems to be caused by StreetComplete itself: https://github.com/GrapheneOS/os-issue-tracker/issues/1362#issuecomment-1208269870

It is a "write after free bug" in the app, so something that needs to be fixed on the side of StreetComplete if I understood correctly :)

thestinger commented 2 years ago

This is a write-after-free or buffer overflow bug in your app. When an allocation is freed with hardened_malloc, the memory is zeroed. When allocations are created, hardened_malloc checks that the data is still zero to detect writes which occurred while the allocation was freed (write-after-free). It could be because of a use-after-free bug (most likely) or a buffer overflow from a nearby allocation. The traceback for the abort shows where the error was detected on allocation, not where it was freed. You need to use ASan and other tooling like that to debug the issue. It's almost certainly a memory corruption bug in the app code or library code used by it.
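A toy model can make the mechanism described above concrete. The sketch below is written in Python for readability and is not hardened_malloc's implementation: `ToyHeap`, `buggy_native_library` and `unrelated_caller` are hypothetical names, and the fixed-size slots with a first-in-first-out free list are simplifying assumptions. It only illustrates why the abort shows up in an allocation made by unrelated code.

```python
class ToyHeap:
    """Toy model of zero-on-free plus check-on-allocate (assumed design)."""

    def __init__(self, slot_count=4, slot_size=8):
        self.slot_size = slot_size
        self.memory = bytearray(slot_count * slot_size)
        self.free_slots = list(range(slot_count))

    def _bounds(self, slot):
        start = slot * self.slot_size
        return start, start + self.slot_size

    def allocate(self):
        slot = self.free_slots.pop(0)
        start, end = self._bounds(slot)
        # Freed memory was zeroed, so a non-zero byte means a write after free.
        if any(self.memory[start:end]):
            raise RuntimeError("detected write after free")
        return slot

    def free(self, slot):
        start, end = self._bounds(slot)
        self.memory[start:end] = bytes(self.slot_size)  # zero on free
        self.free_slots.append(slot)

    def write(self, slot, offset, value):
        # Deliberately unchecked, like a raw pointer write in C or C++.
        start, _ = self._bounds(slot)
        self.memory[start + offset] = value


def buggy_native_library(heap):
    slot = heap.allocate()
    heap.write(slot, 0, 0x41)
    heap.free(slot)
    heap.write(slot, 0, 0x42)  # the actual bug: write through a stale handle


def unrelated_caller(heap):
    # Innocent code that merely asks for memory.
    return heap.allocate()


def demo():
    heap = ToyHeap()
    buggy_native_library(heap)
    clean_allocations = 0
    try:
        while True:
            unrelated_caller(heap)
            clean_allocations += 1
    except RuntimeError as err:
        return clean_allocations, str(err)


if __name__ == "__main__":
    print(demo())
```

In this model the first three allocations after the bug succeed, and the fourth one, which happens to receive the corrupted slot, raises the error. The function that hits the error never touched the stale handle.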

westnordost commented 2 years ago

Android applications run in ART, this is a JVM environment. Code that runs in the JVM cannot cause a memory (corruption) error.

matkoniecz commented 2 years ago

Is it possible that kotlinx.serialization.internal.PlatformKt has some buggy native code that crashes here?

Altonss commented 2 years ago

This is a write-after-free or buffer overflow bug in your app. When an allocation is freed with hardened_malloc, the memory is zeroed. When allocations are created, hardened_malloc checks that the data is still zero to detect writes which occurred while the allocation was freed (write-after-free). It could be because of a use-after-free bug (most likely) or a buffer overflow from a nearby allocation. The traceback for the abort shows where the error was detected on allocation, not where it was freed. You need to use ASan and other tooling like that to debug the issue. It's almost certainly a memory corruption bug in the app code or library code used by it.

Thanks a lot for the explanation :)

westnordost commented 2 years ago

Is it possible that kotlinx.serialization.internal.PlatformKt has some buggy native code that crashes here?

Possible. One thing I did not mention in the comment I posted 2 minutes ago is that it is possible to add native code to Java libraries (via JNI), so any Java library could theoretically have some native components. StreetComplete does not, but perhaps kotlinx.serialization does.

thestinger commented 2 years ago

Android applications run in ART, this is a JVM environment. Code that runs in the JVM cannot cause a memory (corruption) error.

Java has an unsafe module and JNI for using C and C++ libraries. There are plenty of memory corruption bugs in Android apps via their C and C++ dependencies.

westnordost commented 2 years ago

@matkoniecz @thestinger doesn't look like the code in PlatformKt is accessing JNI/native code though:

https://github.com/Kotlin/kotlinx.serialization/blob/16a85df254f4f1e317554eb61ee1fbe914800aa4/core/jvmMain/src/kotlinx/serialization/internal/Platform.kt#L118-L125

Just some reflection.

thestinger commented 2 years ago

The traceback shows what triggered the allocation where the write after free was detected. It's not what freed the memory and what wrote to the memory after it was freed. There's no particular reason to assume it has anything to do with that library.

thestinger commented 2 years ago

This is a write-after-free or buffer overflow bug in your app. When an allocation is freed with hardened_malloc, the memory is zeroed. When allocations are created, hardened_malloc checks that the data is still zero to detect writes which occurred while the allocation was freed (write-after-free). It could be because of a use-after-free bug (most likely) or a buffer overflow from a nearby allocation. The traceback for the abort shows where the error was detected on allocation, not where it was freed. You need to use ASan and other tooling like that to debug the issue. It's almost certainly a memory corruption bug in the app code or library code used by it.

Look at what I wrote above. The traceback is where memory was allocated where hardened_malloc found that something had written to memory that wasn't allocated. What likely happened is that something freed an allocation and then wrote to it after free. The traceback showing where it was detected doesn't have a particular reason to be related. It's the same kind of thing as detecting that something overwrote a random canary. The place that detects it is just where the check was run, not what caused the memory corruption.

thestinger commented 2 years ago

There are debugging tools like Valgrind which can likely detect the problem and provide a useful traceback. The hardened_malloc traceback isn't useful for debugging this. It just shows that it found a write-after-free, not what did it.

westnordost commented 2 years ago

But write-after-free of what?

westnordost commented 2 years ago

Ah, I understand now. So, the traceback is really useless for that matter.

westnordost commented 2 years ago

Thank you, @thestinger

thestinger commented 2 years ago

You could try using https://developer.android.com/ndk/guides/wrap-script with https://android.googlesource.com/platform/bionic/+/master/libc/malloc_debug/README.md. The free_track feature is similar to the hardened_malloc feature but is designed for debugging and should show where the free occurred which is almost always the code that's at fault because it likely used it after it freed it.
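The idea of tracking frees can be sketched with a toy model. This is not how malloc_debug is implemented: `TrackingHeap`, `hypothetical_map_renderer` and `hypothetical_json_parser` are made-up names, and the slot-based heap is an assumption for illustration. The point is only that remembering who freed each block lets the error report name the likely culprit instead of the innocent allocation site.

```python
import traceback


class TrackingHeap:
    """Toy heap that remembers which function freed each slot (assumed design)."""

    def __init__(self, slot_count=2, slot_size=4):
        self.slot_size = slot_size
        self.memory = bytearray(slot_count * slot_size)
        self.free_slots = list(range(slot_count))
        self.freed_by = {}  # slot index -> name of the function that freed it

    def allocate(self):
        slot = self.free_slots.pop()  # most recently freed slot is reused first
        start = slot * self.slot_size
        if any(self.memory[start:start + self.slot_size]):
            culprit = self.freed_by.get(slot, "unknown")
            raise RuntimeError(
                f"write after free in slot {slot}, freed by {culprit}()"
            )
        return slot

    def free(self, slot):
        start = slot * self.slot_size
        self.memory[start:start + self.slot_size] = bytes(self.slot_size)
        # The last two stack entries are [caller of free, free itself].
        self.freed_by[slot] = traceback.extract_stack(limit=2)[0].name
        self.free_slots.append(slot)

    def write(self, slot, value):
        # Unchecked write to the first byte of the slot.
        self.memory[slot * self.slot_size] = value


def hypothetical_map_renderer(heap):
    slot = heap.allocate()
    heap.free(slot)
    heap.write(slot, 0x7F)  # bug: the slot is used after it was freed


def hypothetical_json_parser(heap):
    return heap.allocate()  # innocent allocation that trips the check


def demo():
    heap = TrackingHeap()
    hypothetical_map_renderer(heap)
    try:
        hypothetical_json_parser(heap)
    except RuntimeError as err:
        return str(err)
    return "no error detected"


if __name__ == "__main__":
    print(demo())
```

The error is still raised inside the parser's allocation, but the message points at the renderer, which is the code that freed the slot and then kept using it.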

westnordost commented 2 years ago

So, as far as I know, there is only one library included in StreetComplete that uses JNI to access a native C++-library and that library is tangram-es.

Which is a bit of a problem, because the development of tangram-es is effectively pretty much dead.

I reported 5 individual crashes in February (https://github.com/tangrams/tangram-es/issues/2315) and nothing was investigated, nor were debug symbols provided to get proper stack traces in the native code. I think it makes no sense to post a ticket for the suspicion of a memory allocation error over at tangram-es. And I am too little accustomed to C++ to just (fork and) look for the issues myself in that library.

@thestinger thank you for the link. It reads though as if this is a tool for developers, i.e. adding this wrap.sh will only affect debug builds. This would necessitate me being able to reproduce it, though I haven't had the issue as reported yet (maybe because I am not using Android 12 yet).

thestinger commented 2 years ago

@westnordost It likely crashes outside GrapheneOS but only very rarely. GrapheneOS is detecting it as part of a set of security features to detect exploitable use-after-free vulnerabilities. Without the code for detecting it, the write after free would usually not be noticed unless something allocates memory before the write occurs in which case something being used would be overwritten.
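The timing argument can be illustrated with a toy allocator that has no checks and reuses freed memory immediately. `PlainHeap` and `stale_write_scenario` are hypothetical names, and last-in-first-out reuse is an assumption chosen to keep the example deterministic; real allocators differ in detail.

```python
class PlainHeap:
    """Toy allocator without any checks; freed slots are reused right away."""

    def __init__(self, slot_count=2, slot_size=4):
        self.slot_size = slot_size
        self.memory = bytearray(slot_count * slot_size)
        self.free_slots = list(range(slot_count))

    def allocate(self):
        return self.free_slots.pop()  # most recently freed slot comes back first

    def free(self, slot):
        self.free_slots.append(slot)

    def write(self, slot, data):
        start = slot * self.slot_size
        self.memory[start:start + len(data)] = data

    def read(self, slot):
        start = slot * self.slot_size
        return bytes(self.memory[start:start + self.slot_size])


def stale_write_scenario(reuse_before_write):
    """Return what the new owner of the slot sees after a stale write."""
    heap = PlainHeap()
    stale = heap.allocate()
    heap.write(stale, b"old!")
    heap.free(stale)
    if reuse_before_write:
        owner = heap.allocate()     # receives the slot that was just freed
        heap.write(owner, b"good")
        heap.write(stale, b"evil")  # bug: overwrites data that is in use
    else:
        heap.write(stale, b"evil")  # bug: nobody owns the slot yet
        owner = heap.allocate()
        heap.write(owner, b"good")  # new owner initializes, hiding the bug
    return heap.read(owner)


if __name__ == "__main__":
    print(stale_write_scenario(False), stale_write_scenario(True))
```

When the stale write happens before the slot is handed out again, the new owner overwrites it and nothing looks wrong. When the slot is reused first, the new owner's data is silently replaced, which is the rare and hard-to-explain crash case.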

thestinger commented 2 years ago

The crashes you're reporting are likely the same set of bugs. hardened_malloc is just detecting it and aborting instead of having an unexplained crash. hardened_malloc quarantines memory after it gets freed rather than allowing it to be used right away, and then checks to see if anything wrote to it after free. That's why it's a safe abort instead of some unexplained memory corruption crash.
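Quarantining can be sketched in the same spirit. The model below is a toy under stated assumptions, not hardened_malloc's actual data structures: `QuarantineHeap` is a hypothetical name, and a fixed-length first-in-first-out quarantine is chosen only for simplicity. Freed slots are zeroed, held back from reuse, and checked when they leave the quarantine.

```python
from collections import deque


class QuarantineHeap:
    """Toy heap that delays reuse of freed slots (assumed design)."""

    def __init__(self, slot_count=4, slot_size=4, quarantine_len=2):
        self.slot_size = slot_size
        self.memory = bytearray(slot_count * slot_size)
        self.free_slots = deque(range(slot_count))
        self.quarantine = deque()
        self.quarantine_len = quarantine_len

    def allocate(self):
        return self.free_slots.popleft()

    def free(self, slot):
        start = slot * self.slot_size
        self.memory[start:start + self.slot_size] = bytes(self.slot_size)
        self.quarantine.append(slot)
        if len(self.quarantine) > self.quarantine_len:
            self._release(self.quarantine.popleft())

    def _release(self, slot):
        start = slot * self.slot_size
        # The slot was zeroed on free and never handed out since then.
        if any(self.memory[start:start + self.slot_size]):
            raise RuntimeError("detected write after free")
        self.free_slots.append(slot)

    def write(self, slot, data):
        start = slot * self.slot_size
        self.memory[start:start + len(data)] = data

    def read(self, slot):
        start = slot * self.slot_size
        return bytes(self.memory[start:start + self.slot_size])


def demo():
    heap = QuarantineHeap()
    stale = heap.allocate()
    live = heap.allocate()
    heap.write(live, b"live")
    heap.free(stale)            # the freed slot enters the quarantine
    fresh = heap.allocate()     # a different slot is handed out
    heap.write(stale, b"evil")  # bug: lands in quarantined, unowned memory
    live_intact = heap.read(live) == b"live"
    try:
        heap.free(live)
        heap.free(fresh)        # quarantine overflows; oldest slot is checked
    except RuntimeError as err:
        return fresh != stale, live_intact, str(err)
    return fresh != stale, live_intact, "no error detected"


if __name__ == "__main__":
    print(demo())
```

Because the freed slot is not reused while quarantined, the stale write cannot damage live data in this model, and the bug is reported as a clean abort when the slot is checked on its way out.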

If you use malloc_debug, Valgrind or ASan, you can likely find the actual sources of these bugs. It probably is caused by that library. I see other assorted crashes reported.

westnordost commented 2 years ago

Makes sense from a security perspective. Every native library with unsolved memory allocation errors is a security risk. (I understand Rust solves this, right? I.e. it is not possible to write code that will produce these kinds of errors)

matkoniecz commented 2 years ago

Rust solves this, right?

Rust makes it easier to solve this, but still allows unsafe operations. See https://doc.rust-lang.org/nomicon/meet-safe-and-unsafe.html

For example you can call C code from Rust. And the same applies to all libraries one uses and so on.

thestinger commented 2 years ago

I understand Rust solves this, right? I.e. it is not possible to write code that will produce these kinds of errors

Yeah, within the normal safe subset of Rust, you can't trigger memory corruption despite it not having a garbage collector. It formalizes the usual rules about ownership / lifetimes in C in a strict way that prevents both bounds (spatial) and lifetime (temporal) issues. It mostly does that with static checks at compile-time based on the type system which tracks lifetime scopes as subtypes and enforces rules on reference aliasing, etc. so the rug can't be pulled out from underneath something by one reference changing / removing something another is using, etc.

westnordost commented 2 years ago

For example you can call C code from Rust. And the same applies to all libraries one uses and so on.

Yeah, but true for Java as well, so...

One more language to learn on my list.

westnordost commented 2 years ago

Actually, I will close this ticket, as it does not add any useful information:

We know there are crashes in native code and we know where they come from (tangram-es, at least). Only after some/all of the native crashes caused by tangram-es are fixed does it make sense to revisit this, if the issue is still reproducible then.

Altonss commented 2 years ago

Yes the crashes are probably due to tangram-es :/

thestinger commented 2 years ago

Next release of GrapheneOS will have a per-app toggle for using 39-bit address space and Scudo (standard Android malloc) instead of 48-bit address space and hardened_malloc.

 <string name="app_relax_hardening_title">Exploit protection compatibility mode</string>
 <string name="app_relax_hardening_summary">Improve compatibility with misbehaving apps by using Android\'s standard address space size and memory allocator</string>

If you get further reports about it you can tell users to enable that and it should work.

westnordost commented 2 years ago

Thanks for the info!

cobalt32 commented 1 year ago

Sorry for the necropost, but enabling Exploit protection compatibility mode for StreetComplete in settings "fixes" the crashing problem on GrapheneOS.