realm / realm-java

Realm is a mobile database: a replacement for SQLite & ORMs
http://realm.io
Apache License 2.0

Fatal signal 11 (SIGSEGV) from Java_io_realm_internal_UncheckedRow_nativeGetString #6152

Closed bfichter closed 4 years ago

bfichter commented 6 years ago

Goal

No crashes

Expected Results

No crashes

Actual Results

Crashing consistently for one affected user with a seemingly corrupted DB state

A/DEBUG: *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
Build fingerprint: 'samsung/dreamqltesq/dreamqltesq:8.0.0/R16NW/G950USQU5CRG3:user/release-keys'
Revision: '12'
ABI: 'arm64'
A/DEBUG: pid: 27327, tid: 27372, name: RxComputationTh  >>> com.preveil.preveil <<<
signal 11 (SIGSEGV), code 2 (SEGV_ACCERR), fault addr 0x78713fe000
x0   00000078704e4380  x1   00000078704e437f  x2   0000007871254317  x3   00000078704e43d8
x4   00000078704e4380  x5   000000786f5a9d7d  x6   0000000000000000  x7   0000000000000000
x8   0000000000000000  x9   0000000000000000  x10  0000000000000001  x11  0000000000000000
x12  000000786fc05210  x13  0000000001000000  x14  0000000000000000  x15  0000000000000000
x16  000000787380e570  x17  000000788e181970  x18  0000000000000020  x19  00000078704e4380
x20  000000787376a000  x21  00000078704e4540  x22  00000078713fe000  x23  00000078704e4380
x24  00000078704e437f  x25  0000000000000001  x26  00000078704e4568  x27  00000078704e45d0
x28  00000078704e4570  x29  00000078704e42f0  x30  00000078735424e0
sp   00000078704e42f0  pc   00000078735424cc  pstate 0000000020000000
09-04 11:23:56.963 27385-27385/? A/DEBUG: backtrace:
#00 pc 000000000003b4cc  /data/app/com.preveil.preveil-FG01oMB2aWtfSFb4Aipq1w==/lib/arm64/librealm-jni.so
#01 pc 00000000000be5d8  /data/app/com.preveil.preveil-FG01oMB2aWtfSFb4Aipq1w==/lib/arm64/librealm-jni.so
#02 pc 00000000000b6f28  /data/app/com.preveil.preveil-FG01oMB2aWtfSFb4Aipq1w==/lib/arm64/librealm-jni.so (Java_io_realm_internal_UncheckedRow_nativeGetString+92)
#03 pc 0000000000510d00  /system/lib64/libart.so (art_quick_generic_jni_trampoline+144)
#04 pc 000000000000f8bc  /dev/ashmem/dalvik-jit-code-cache_27327_27327 (deleted)

Steps & Code to Reproduce

So far, only one known user has encountered this issue. This user encounters the crash every time they launch the app. Fortunately, I have access to the user's device and have hooked it up to the debugger. There seem to be 3 RealmObjects (all of the same ChildObject type described below) out of hundreds that have somehow become corrupted, and trying to access any of these 3 objects will seg fault. I've tried accessing these objects within a DynamicRealm, but that seg faults as well.

Although the stacktrace above happens on an RxComputation thread, the crash persists when I run everything on the main thread.

Code Sample

Unfortunately I can't share specific code or realm files, but I'll describe the relevant schema structure and access which is causing the crash.

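import io.realm.Realm
import io.realm.RealmList
import io.realm.RealmObject
import io.realm.annotations.PrimaryKey
import java.util.UUID

open class ParentObject : RealmObject() {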
    @PrimaryKey
    var identifier = UUID.randomUUID().toString()
    var children = RealmList<ChildObject>()
    // other properties
}

open class ChildObject : RealmObject() {
    @PrimaryKey
    var identifier = UUID.randomUUID().toString()
    // other properties
}

// Elsewhere, on app launch
val parents = Realm.getDefaultInstance().where(ParentObject::class.java).findAll()
parents.forEach { parentObject ->
    parentObject.children.forEach { childObject ->
        // For hundreds of ChildObjects, this is totally fine
        // But for 3 seemingly corrupted objects, this seg faults
        val property = childObject.property
    }
}

Version of Realm and tooling

Realm version(s): 5.3.1 w/encryption enabled

Realm sync feature enabled: no

Android Studio version: 3.1.3

Which Android version and device: Samsung Galaxy S8 running Android 8

Zey-Uzh commented 5 years ago

I have exactly the same problem. The error rate increased when I changed the Realm version from 5.3.1 to 5.4.2. During the last day I received 661 crash reports from Google.

09-12 17:35:47.396 20978-20963/? A/google-breakpad: Microdump skipped (uninteresting)
09-12 17:35:47.424 20776-20963/? W/google-breakpad: ### ### ### ### ### ### ### ### ### ### ### ### ###
    Chrome build fingerprint:
    69.0.3497.91
    349709152
    ### ### ### ### ### ### ### ### ### ### ### ### ###
09-12 17:35:47.428 20776-20963/? A/libc: Fatal signal 11 (SIGSEGV), code 2, fault addr 0x7f4fafb000 in tid 20963 (RxAndroidHandle)
09-12 17:35:47.525 20979-20979/? A/DEBUG: *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
    Build fingerprint: 'Xiaomi/rosy_ru/rosy:7.1.2/N2G47H/V9.6.3.0.NDARUFD:user/release-keys'
09-12 17:35:47.526 20979-20979/? A/DEBUG: Revision: '0'
    ABI: 'arm64'
    pid: 20776, tid: 20963, name: RxAndroidHandle  >>> ru.fcb.mobilefieldcollection <<<
    signal 11 (SIGSEGV), code 2 (SEGV_ACCERR), fault addr 0x7f4fafb000
        x0   0000007f4fbfa1c0  x1   0000007f4fbfa1bf  x2   0000007f4f0817d7  x3   0000007f4fbfa218
        x4   0000007f4ce798bd  x5   0000007f811fd940  x6   0000007f4ce798bd  x7   0000007f811fd804
        x8   0000000000000000  x9   0000000000000001  x10  0000000000000001  x11  0000000000000000
        x12  0000000000000001  x13  00000000ffffffff  x14  000000000011e580  x15  0000000000001fe3
        x16  0000007f56e08570  x17  0000007f81198b48  x18  00000000ac2a017b  x19  0000007f4fbfa1c0
        x20  0000007f56d63000  x21  0000007f4fbfa380  x22  0000007f4fafb000  x23  0000007f4fbfa1c0
        x24  0000007f4fbfa1bf  x25  0000000000000001  x26  0000007f4fbfa3a8  x27  0000007f4fbfa410
09-12 17:35:47.527 20979-20979/? A/DEBUG:     x28  0000007f4fbfa3b0  x29  0000007f4fbfa130  x30  0000007f56b3cdf0
        sp   0000007f4fbfa130  pc   0000007f56b3cddc  pstate 0000000020000000
09-12 17:35:47.540 20979-20979/? A/DEBUG: backtrace:
        #00 pc 000000000003bddc  /data/app/ru.fcb.mobilefieldcollection-2/lib/arm64/librealm-jni.so
        #01 pc 00000000000bcc68  /data/app/ru.fcb.mobilefieldcollection-2/lib/arm64/librealm-jni.so
        #02 pc 00000000000b65b8  /data/app/ru.fcb.mobilefieldcollection-2/lib/arm64/librealm-jni.so (Java_io_realm_internal_UncheckedRow_nativeGetString+92)
        #03 pc 000000000072a9c0  /data/app/ru.fcb.mobilefieldcollection-2/oat/arm64/base.odex (offset 0x6b8000)

I was able to get a corrupted database: realm_crash_1.zip

My configuration:

new RealmConfiguration
                .Builder()
                .name(realmName)
                .schemaVersion(Constants.DB_VERSION)
                .directory(dbDir)
                .encryptionKey(key)
                .migration(MIGRATION)
                .initialData(INIT_REALM_TX)
                .build();

Version of Realm and tooling

Realm version(s): 5.4.2 w/encryption enabled

Realm sync feature enabled: no

Android Studio version: 3.1.4

Which Android version and device: Xiaomi Redmi 5

vladimirfx commented 5 years ago

Sometimes the stack is slightly different: android_crush_code-6(SI_TKILL).log

Realm version: 5.5.0 w/encryption enabled

Realm sync feature enabled: no

Android Studio version: 3.1.4

Which Android version and device: Emulator x86 API 28

mussa-ibragimov commented 5 years ago

This is happening on Samsung Galaxy S7 edge (hero2lte), Android 8.0

ggajews commented 5 years ago

Any updates on this? We have many reports on 5.7.1 with encryption enabled, but can't reproduce it locally.

vladimirfx commented 5 years ago

Realm version: 5.9.1 with encryption enabled

Realm sync feature enabled: no

Android Studio version: 3.3.2

Which Android version and device: Xiaomi Redmi 5 (Android 7.1.2), Honor 10 (Android 9)

First crash:

03-13 14:25:36.592 8582-8894/ru.fcb.mobilefieldcollection A/libc: Fatal signal 6 (SIGABRT), code -6 in tid 8894 (RxAndroidHandle)

Subsequent run:

03-13 14:32:23.447 12602-12645/ru.fcb.mobilefieldcollection A/libc: Fatal signal 11 (SIGSEGV), code 1, fault addr 0x7f5db44000 in tid 12645 (RxAndroidHandle)

Or:

03-13 14:37:47.875 13034-13085/ru.fcb.mobilefieldcollection A/libc: Fatal signal 11 (SIGSEGV), code 2, fault addr 0x7f5cc44000 in tid 13085 (RxAndroidHandle)

Reproducible locally. It seems that some data blocks are corrupted, because the crash happens on the items list screen. When I change the filtering, the crash does not happen. The same entries are shown on another view (a map with markers) and there is no crash there either, but that view accesses only 3 numeric fields (id, lon, lat) of the entity. Hope this helps. I can try with a debug/experimental Realm build.
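For anyone tabulating reports like the ones above, here is a minimal sketch that pulls the signal number, code, and fault address out of a logcat or tombstone line. The class name `FatalSignalLine` is hypothetical, and the page-alignment check assumes 4 KiB pages. Both fault addresses quoted in this comment happen to be 4 KiB-aligned; whether that is significant is not established in this thread.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for tabulating "signal N (NAME), code C, fault addr 0x..." lines. */
final class FatalSignalLine {
    // Accepts both the libc form ("Fatal signal 11 (SIGSEGV), code 2, fault addr 0x...")
    // and the debuggerd form ("signal 11 (SIGSEGV), code 2 (SEGV_ACCERR), fault addr 0x...").
    private static final Pattern LINE = Pattern.compile(
            "[Ss]ignal (\\d+) \\((\\w+)\\), code (-?\\d+)(?: \\(\\w+\\))?"
                    + "(?:, fault addr 0x([0-9a-fA-F]+))?");

    final int signal;
    final String name;
    final int code;
    final Long faultAddr; // null when the line carries no hexadecimal fault address

    private FatalSignalLine(int signal, String name, int code, Long faultAddr) {
        this.signal = signal;
        this.name = name;
        this.code = code;
        this.faultAddr = faultAddr;
    }

    /** Returns the parsed line, or null if the text contains no signal description. */
    static FatalSignalLine parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.find()) {
            return null;
        }
        Long addr = m.group(4) == null
                ? null
                : Long.valueOf(Long.parseUnsignedLong(m.group(4), 16));
        return new FatalSignalLine(
                Integer.parseInt(m.group(1)), m.group(2), Integer.parseInt(m.group(3)), addr);
    }

    /** True when the fault address is a multiple of 4096 (assumed page size). */
    boolean pageAligned() {
        return faultAddr != null && (faultAddr & 0xFFFL) == 0;
    }

    public static void main(String[] args) {
        String[] lines = {
            "A/libc: Fatal signal 6 (SIGABRT), code -6 in tid 8894 (RxAndroidHandle)",
            "A/libc: Fatal signal 11 (SIGSEGV), code 1, fault addr 0x7f5db44000 in tid 12645 (RxAndroidHandle)",
        };
        for (String line : lines) {
            FatalSignalLine s = parse(line);
            System.out.println(s.name + " code=" + s.code + " pageAligned=" + s.pageAligned());
        }
    }
}
```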

vladimirfx commented 5 years ago

While experimenting with different Realm versions, I've noticed that 5.3.1 has the fewest crashes. We use 5.3.1 in production; however, even with 5.3.1, Realm crashes are the top 7 and >90% of all crashes. Please assist.

Crashes_screen crash report :(

bmunkholm commented 5 years ago

@mussa-ibragimov @ggajews It would be useful to provide more information - e.g. a stacktrace so we can see if it's really the same issue. As it seems related to getting a string out of the database, any realm that has this issue, together with the models you use and information about the character sets that could be stored would be helpful.
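One way to check whether two reports are "really the same issue" is to reduce each backtrace to its symbolized frames and compare those. The following is only a sketch: `BacktraceSignature` is a hypothetical helper, not part of Realm or the Android tooling, and its regular expression handles only plain symbols of the form `(name+offset)`. Frames whose symbol contains parentheses (for example demangled C++ names) are skipped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: reduces a tombstone backtrace to a comparable signature. */
final class BacktraceSignature {
    // Matches lines such as:
    //   #02 pc 00000000000b6f28  /data/app/.../librealm-jni.so (Java_..._nativeGetString+92)
    // Group 1: frame number, 2: library path, 3: symbol (optional), 4: offset (optional).
    private static final Pattern FRAME = Pattern.compile(
            "#(\\d+)\\s+pc\\s+[0-9a-fA-F]+\\s+(\\S+)(?:\\s+\\(([^()+]+)\\+(\\d+)\\))?");

    /** Returns "library!symbol+offset" for every frame that carries a plain symbol. */
    static List<String> symbolizedFrames(List<String> logLines) {
        List<String> out = new ArrayList<>();
        for (String line : logLines) {
            Matcher m = FRAME.matcher(line);
            if (m.find() && m.group(3) != null) {
                String path = m.group(2);
                // Drop the per-install directory so reports from different devices compare equal.
                String lib = path.substring(path.lastIndexOf('/') + 1);
                out.add(lib + "!" + m.group(3) + "+" + m.group(4));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> report = Arrays.asList(
                "#00 pc 000000000003b4cc  /data/app/com.example/lib/arm64/librealm-jni.so",
                "#02 pc 00000000000b6f28  /data/app/com.example/lib/arm64/librealm-jni.so"
                        + " (Java_io_realm_internal_UncheckedRow_nativeGetString+92)");
        System.out.println(symbolizedFrames(report));
    }
}
```

The arm64 reports in this thread all reduce to the same single symbolized frame, `Java_io_realm_internal_UncheckedRow_nativeGetString+92`, even though the raw `pc` values differ between Realm versions.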

bmunkholm commented 5 years ago

@Zey-Uzh While it's very useful to get the database that appears corrupt, we need the encryption key to actually open it :-)

ggajews commented 5 years ago

@bmunkholm I would love to provide more info, but all I have are the Play Store reports

#00  pc 000000000003d03c  /data/app/com.crashing.app-YaEGSwTeIOaFDQ6E-FZYQw==/lib/arm64/librealm-jni.so
#01  pc 00000000000bf300  /data/app/com.crashing.app-YaEGSwTeIOaFDQ6E-FZYQw==/lib/arm64/librealm-jni.so
#02  pc 00000000000b8298  /data/app/com.crashing.app-YaEGSwTeIOaFDQ6E-FZYQw==/lib/arm64/librealm-jni.so (Java_io_realm_internal_UncheckedRow_nativeGetString+92)
#03  pc 00000000007b09ac  /data/app/com.crashing.app-YaEGSwTeIOaFDQ6E-FZYQw==/oat/arm64/base.odex

We have trouble reproducing it locally. We received 820 reports last week, affecting 105 users. Let me know if I can get you something more usable somehow.

vladimirfx commented 5 years ago

@bmunkholm Where should I send the corrupted DB with the key? As far as I know, there is no way to change the on-disk character encoding for realm-java. The actual characters in our DB are mostly from the Cyrillic Unicode code range: https://jrgraphix.net/r/Unicode/0400-04FF I'll try to populate the DB with ASCII-only characters...

cmelchior commented 5 years ago

@vladimirfx You can send it to help@realm.io

But having a DB that throws this error would be a massive help (and a pointer to which object actually throws the error).

vladimirfx commented 5 years ago

Corrupted DB emailed to help@realm.io

cmelchior commented 5 years ago

I assume this is the crash we should look for?

2018-09-13 13:56:25.398 22429-22429/? A/DEBUG: *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
2018-09-13 13:56:25.398 22429-22429/? A/DEBUG: Build fingerprint: 'google/sdk_gphone_x86/generic_x86:9/PSR1.180720.012/4923214:userdebug/dev-keys'
2018-09-13 13:56:25.398 22429-22429/? A/DEBUG: Revision: '0'
2018-09-13 13:56:25.398 22429-22429/? A/DEBUG: ABI: 'x86'
2018-09-13 13:56:25.398 22429-22429/? A/DEBUG: pid: 22074, tid: 22140, name: RxAndroidHandle  >>> ru.fcb.mobilefieldcollection <<<
2018-09-13 13:56:25.398 22429-22429/? A/DEBUG: signal 6 (SIGABRT), code -6 (SI_TKILL), fault addr --------
2018-09-13 13:56:25.398 22429-22429/? A/DEBUG:     eax 00000000  ebx 0000563a  ecx 0000567c  edx 00000006
2018-09-13 13:56:25.398 22429-22429/? A/DEBUG:     edi 0000563a  esi f50712a8
2018-09-13 13:56:25.398 22429-22429/? A/DEBUG:     ebp d1efa638  esp d1efa568  eip f6259b59
2018-09-13 13:56:25.509 2850-2974/system_process I/GnssLocationProvider: WakeLock acquired by sendMessage(REPORT_SV_STATUS, 0, com.android.server.location.GnssLocationProvider$SvStatusInfo@b5788a9)
2018-09-13 13:56:25.510 2850-2870/system_process I/GnssLocationProvider: WakeLock released by handleMessage(REPORT_SV_STATUS, 0, com.android.server.location.GnssLocationProvider$SvStatusInfo@b5788a9)
2018-09-13 13:56:25.629 22429-22429/? A/DEBUG: backtrace:
2018-09-13 13:56:25.629 22429-22429/? A/DEBUG:     #00 pc 00000b59  [vdso:f6259000] (__kernel_vsyscall+9)
2018-09-13 13:56:25.629 22429-22429/? A/DEBUG:     #01 pc 0001fdf8  /system/lib/libc.so (syscall+40)
2018-09-13 13:56:25.629 22429-22429/? A/DEBUG:     #02 pc 00022ed3  /system/lib/libc.so (abort+115)
2018-09-13 13:56:25.629 22429-22429/? A/DEBUG:     #03 pc 00260254  /data/app/ru.fcb.mobilefieldcollection-tiBKHhQFE9EmReHKcG6ROw==/lib/x86/librealm-jni.so (__gnu_cxx::__verbose_terminate_handler()+452)
2018-09-13 13:56:25.629 22429-22429/? A/DEBUG:     #04 pc 0022d967  /data/app/ru.fcb.mobilefieldcollection-tiBKHhQFE9EmReHKcG6ROw==/lib/x86/librealm-jni.so (__cxxabiv1::__terminate(void (*)())+23)
2018-09-13 13:56:25.629 22429-22429/? A/DEBUG:     #05 pc 0025fe85  /data/app/ru.fcb.mobilefieldcollection-tiBKHhQFE9EmReHKcG6ROw==/lib/x86/librealm-jni.so
2018-09-13 13:56:25.629 22429-22429/? A/DEBUG:     #06 pc 0022d0c1  /data/app/ru.fcb.mobilefieldcollection-tiBKHhQFE9EmReHKcG6ROw==/lib/x86/librealm-jni.so (__gxx_personality_v0+321)
2018-09-13 13:56:25.629 22429-22429/? A/DEBUG:     #07 pc 002741b8  /data/app/ru.fcb.mobilefieldcollection-tiBKHhQFE9EmReHKcG6ROw==/lib/x86/librealm-jni.so
2018-09-13 13:56:25.629 22429-22429/? A/DEBUG:     #08 pc 00274606  /data/app/ru.fcb.mobilefieldcollection-tiBKHhQFE9EmReHKcG6ROw==/lib/x86/librealm-jni.so (_Unwind_Resume+92)
2018-09-13 13:56:25.629 22429-22429/? A/DEBUG:     #09 pc 001183e0  /data/app/ru.fcb.mobilefieldcollection-tiBKHhQFE9EmReHKcG6ROw==/lib/x86/librealm-jni.so
2018-09-13 13:56:25.629 22429-22429/? A/DEBUG:     #10 pc 00118ac0  /data/app/ru.fcb.mobilefieldcollection-tiBKHhQFE9EmReHKcG6ROw==/lib/x86/librealm-jni.so
2018-09-13 13:56:25.629 22429-22429/? A/DEBUG:     #11 pc 00117661  /data/app/ru.fcb.mobilefieldcollection-tiBKHhQFE9EmReHKcG6ROw==/lib/x86/librealm-jni.so
2018-09-13 13:56:25.629 22429-22429/? A/DEBUG:     #12 pc 001ba3a8  /data/app/ru.fcb.mobilefieldcollection-tiBKHhQFE9EmReHKcG6ROw==/lib/x86/librealm-jni.so
2018-09-13 13:56:25.629 22429-22429/? A/DEBUG:     #13 pc 00225cac  /data/app/ru.fcb.mobilefieldcollection-tiBKHhQFE9EmReHKcG6ROw==/lib/x86/librealm-jni.so
2018-09-13 13:56:25.629 22429-22429/? A/DEBUG:     #14 pc 0020735a  /data/app/ru.fcb.mobilefieldcollection-tiBKHhQFE9EmReHKcG6ROw==/lib/x86/librealm-jni.so
2018-09-13 13:56:25.629 22429-22429/? A/DEBUG:     #15 pc 000a31e9  /data/app/ru.fcb.mobilefieldcollection-tiBKHhQFE9EmReHKcG6ROw==/lib/x86/librealm-jni.so (Java_io_realm_internal_UncheckedRow_nativeGetString+89)
2018-09-13 13:56:25.629 22429-22429/? A/DEBUG:     #16 pc 000f1110  /dev/ashmem/dalvik-jit-code-cache (deleted) (io.realm.internal.UncheckedRow.nativeGetString+144)
2018-09-13 13:56:25.629 22429-22429/? A/DEBUG:     #17 pc 000b4185  /dev/ashmem/dalvik-jit-code-cache (deleted) (io.realm.internal.UncheckedRow.getString+69)
2018-09-13 13:56:25.629 22429-22429/? A/DEBUG:     #18 pc 000dee34  /dev/ashmem/dalvik-jit-code-cache (deleted) (io.realm.ru_fcb_mobilefieldcollection_debt_DebtRealmProxy.realmGet$contragent+308)
2018-09-13 13:56:25.629 22429-22429/? A/DEBUG:     #19 pc 0003a4d8  /dev/ashmem/dalvik-jit-code-cache (deleted) (ru.fcb.mobilefieldcollection.debt.Debt.getContragent+216)
2018-09-13 13:56:25.629 22429-22429/? A/DEBUG:     #20 pc 000504d5  /dev/ashmem/dalvik-jit-code-cache (deleted) (ru.fcb.mobilefieldcollection.alldebts.AllDebtsPresenter.convertToUIDebt+725)
2018-09-13 13:56:25.629 22429-22429/? A/DEBUG:     #21 pc 0005afb6  /dev/ashmem/dalvik-jit-code-cache (deleted) (ru.fcb.mobilefieldcollection.alldebts.AllDebtsPresenter.lambda$bindIntents$0+470)
2018-09-13 13:56:25.629 22429-22429/? A/DEBUG:     #22 pc 005f0d52  /system/lib/libart.so (art_quick_invoke_static_stub+418)
2018-09-13 13:56:25.629 22429-22429/? A/DEBUG:     #23 pc 000a30df  /system/lib/libart.so (art::ArtMethod::Invoke(art::Thread*, unsigned int*, unsigned int, art::JValue*, char const*)+239)
2018-09-13 13:56:25.629 22429-22429/? A/DEBUG:     #24 pc 0029bca2  /system/lib/libart.so (art::interpreter::ArtInterpreterToCompiledCodeBridge(art::Thread*, art::ArtMethod*, art::ShadowFrame*, unsigned short, art::JValue*)+338)
2018-09-13 13:56:25.629 22429-22429/? A/DEBUG:     #25 pc 00293e48  /system/lib/libart.so (bool art::interpreter::DoCall<false, false>(art::ArtMethod*, art::Thread*, art::ShadowFrame&, art::Instruction const*, unsigned short, art::JValue*)+1048)

Unrolled stacktrace:

********** Crash dump: **********
Build fingerprint: 'google/sdk_gphone_x86/generic_x86:9/PSR1.180720.012/4923214:userdebug/dev-keys'
pid: 22074, tid: 22140, name: RxAndroidHandle  >>> ru.fcb.mobilefieldcollection <<<
signal 6 (SIGABRT), code -6 (SI_TKILL), fault addr --------
Stack frame #00 pc 00000b59  [vdso:f6259000] (__kernel_vsyscall+9)
Stack frame #01 pc 0001fdf8  /system/lib/libc.so (syscall+40)
Stack frame #02 pc 00022ed3  /system/lib/libc.so (abort+115)
Stack frame #03 pc 00260254  /data/app/ru.fcb.mobilefieldcollection-tiBKHhQFE9EmReHKcG6ROw==/lib/x86/librealm-jni.so (__gnu_cxx::__verbose_terminate_handler()+452): Routine std::time_get<char, std::istreambuf_iterator<char, std::char_traits<char> > >::_M_extract_via_format(std::istreambuf_iterator<char, std::char_traits<char> >, std::istreambuf_iterator<char, std::char_traits<char> >, std::ios_base&, std::_Ios_Iostate&, tm*, char const*) const at /tmp/ndk-user/tmp/gnu-libstdc++/static-x86-4.9/include/bits/locale_facets_nonio.tcc:779
Stack frame #04 pc 0022d967  /data/app/ru.fcb.mobilefieldcollection-tiBKHhQFE9EmReHKcG6ROw==/lib/x86/librealm-jni.so (__cxxabiv1::__terminate(void (*)())+23): Routine long long realm::TableViewBase::aggregate<2, long long, long long, realm::Column<long long> >(long long (realm::Column<long long>::*)(unsigned int, unsigned int, unsigned int, unsigned int*) const, unsigned int, long long, unsigned int*) const [clone .isra.123] at table_view.cpp:?
Stack frame #05 pc 0025fe85  /data/app/ru.fcb.mobilefieldcollection-tiBKHhQFE9EmReHKcG6ROw==/lib/x86/librealm-jni.so: Routine std::time_get<char, std::istreambuf_iterator<char, std::char_traits<char> > >::_M_extract_via_format(std::istreambuf_iterator<char, std::char_traits<char> >, std::istreambuf_iterator<char, std::char_traits<char> >, std::ios_base&, std::_Ios_Iostate&, tm*, char const*) const at /tmp/ndk-user/tmp/gnu-libstdc++/static-x86-4.9/include/bits/locale_facets_nonio.tcc:627
Stack frame #06 pc 0022d0c1  /data/app/ru.fcb.mobilefieldcollection-tiBKHhQFE9EmReHKcG6ROw==/lib/x86/librealm-jni.so (__gxx_personality_v0+321): Routine realm::TableViewBase::apply_patch(realm::TableViewHandoverPatch&, realm::Group&) at unwind-dw2-fde-dip.c:?
Stack frame #07 pc 002741b8  /data/app/ru.fcb.mobilefieldcollection-tiBKHhQFE9EmReHKcG6ROw==/lib/x86/librealm-jni.so: Routine std::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> >::_Rep::_M_dispose(std::allocator<wchar_t> const&) at /tmp/ndk-user/tmp/gnu-libstdc++/static-x86-4.9/include/bits/basic_string.h:249
Stack frame #08 pc 00274606  /data/app/ru.fcb.mobilefieldcollection-tiBKHhQFE9EmReHKcG6ROw==/lib/x86/librealm-jni.so (_Unwind_Resume+92): Routine std::moneypunct<wchar_t, true>::do_curr_symbol() const at /tmp/ndk-user/tmp/gnu-libstdc++/static-x86-4.9/include/bits/locale_facets_nonio.h:1208 (discriminator 1)
Stack frame #09 pc 001183e0  /data/app/ru.fcb.mobilefieldcollection-tiBKHhQFE9EmReHKcG6ROw==/lib/x86/librealm-jni.so: Routine realm::_impl::ResultsNotifier::ResultsNotifier(realm::Results&) at unwind-dw2-fde-dip.c:?
Stack frame #10 pc 00118ac0  /data/app/ru.fcb.mobilefieldcollection-tiBKHhQFE9EmReHKcG6ROw==/lib/x86/librealm-jni.so: Routine realm::_impl::ResultsNotifier::calculate_changes() at unwind-dw2-fde-dip.c:?
Stack frame #11 pc 00117661  /data/app/ru.fcb.mobilefieldcollection-tiBKHhQFE9EmReHKcG6ROw==/lib/x86/librealm-jni.so: Routine realm::_impl::ResultsNotifier::do_prepare_handover(realm::SharedGroup&) at unwind-dw2-fde-dip.c:?
Stack frame #12 pc 001ba3a8  /data/app/ru.fcb.mobilefieldcollection-tiBKHhQFE9EmReHKcG6ROw==/lib/x86/librealm-jni.so: Routine void std::vector<realm::Group::CascadeNotification::row, std::allocator<realm::Group::CascadeNotification::row> >::_M_insert_aux<realm::Group::CascadeNotification::row const&>(__gnu_cxx::__normal_iterator<realm::Group::CascadeNotification::row*, std::vector<realm::Group::CascadeNotification::row, std::allocator<realm::Group::CascadeNotification::row> > >, realm::Group::CascadeNotification::row const&) at unwind-dw2-fde-dip.c:?
Stack frame #13 pc 00225cac  /data/app/ru.fcb.mobilefieldcollection-tiBKHhQFE9EmReHKcG6ROw==/lib/x86/librealm-jni.so: Routine .L4733 at table.cpp:?
Stack frame #14 pc 0020735a  /data/app/ru.fcb.mobilefieldcollection-tiBKHhQFE9EmReHKcG6ROw==/lib/x86/librealm-jni.so: Routine bool realm::Array::find_optimized<realm::NotEqual, (realm::Action)7, 1u, std::_Bind<std::_Mem_fn<bool (realm::ColumnNodeBase::*)(long long)> (realm::IntegerNodeBase<realm::Column<realm::util::Optional<long long> > >*, std::_Placeholder<1>)> >(long long, unsigned int, unsigned int, unsigned int, realm::QueryState<long long>*, std::_Bind<std::_Mem_fn<bool (realm::ColumnNodeBase::*)(long long)> (realm::IntegerNodeBase<realm::Column<realm::util::Optional<long long> > >*, std::_Placeholder<1>)>, bool, bool) const at unwind-dw2-fde-dip.c:?
Stack frame #15 pc 000a31e9  /data/app/ru.fcb.mobilefieldcollection-tiBKHhQFE9EmReHKcG6ROw==/lib/x86/librealm-jni.so (Java_io_realm_internal_UncheckedRow_nativeGetString+89): Routine realm::Compare<realm::NotEqual, realm::BinaryData, realm::Subexpr, realm::Subexpr>::description(realm::util::serializer::SerialisationState&) const at unwind-dw2-fde-dip.c:?
Stack frame #16 pc 000f1110  /dev/ashmem/dalvik-jit-code-cache (deleted) (io.realm.internal.UncheckedRow.nativeGetString+144)
Stack frame #17 pc 000b4185  /dev/ashmem/dalvik-jit-code-cache (deleted) (io.realm.internal.UncheckedRow.getString+69)
Stack frame #18 pc 000dee34  /dev/ashmem/dalvik-jit-code-cache (deleted) (io.realm.ru_fcb_mobilefieldcollection_debt_DebtRealmProxy.realmGet$contragent+308)
Stack frame #19 pc 0003a4d8  /dev/ashmem/dalvik-jit-code-cache (deleted) (ru.fcb.mobilefieldcollection.debt.Debt.getContragent+216)
Stack frame #20 pc 000504d5  /dev/ashmem/dalvik-jit-code-cache (deleted) (ru.fcb.mobilefieldcollection.alldebts.AllDebtsPresenter.convertToUIDebt+725)
Stack frame #21 pc 0005afb6  /dev/ashmem/dalvik-jit-code-cache (deleted) (ru.fcb.mobilefieldcollection.alldebts.AllDebtsPresenter.lambda$bindIntents$0+470)
Stack frame #22 pc 005f0d52  /system/lib/libart.so (art_quick_invoke_static_stub+418)
Stack frame #23 pc 000a30df  /system/lib/libart.so (art::ArtMethod::Invoke(art::Thread*, unsigned int*, unsigned int, art::JValue*, char const*)+239)
Stack frame #24 pc 0029bca2  /system/lib/libart.so (art::interpreter::ArtInterpreterToCompiledCodeBridge(art::Thread*, art::ArtMethod*, art::ShadowFrame*, unsigned short, art::JValue*)+338)
Stack frame #25 pc 00293e48  /system/lib/libart.so (bool art::interpreter::DoCall<false, false>(art::ArtMethod*, art::Thread*, art::ShadowFrame&, art::Instruction const*, unsigned short, art::JValue*)+1048)

vladimirfx commented 5 years ago

Yes, this is one of many similar stacktraces where one of the text properties of the Debt or Debtor entity is accessed. It can happen on different threads.

vladimirfx commented 5 years ago

I've checked the assumption that the crash depends on string encoding: actually, it does not. Moreover, with ASCII-only text the crash happened sooner than with Cyrillic text. I can send one more corrupted DB.

cmelchior commented 5 years ago

That would be great. Thank you 👏

cmelchior commented 5 years ago

@vladimirfx Thank you for the file

I can reproduce a crash when looping over all Debt objects reading out all String fields. The first crash I encounter is here:

2019-03-13 00:35:47.282 8797-8813/? E/REALM_JAVA: Read: 'comment' on debtId: 44294208
2019-03-13 00:35:48.016 8797-8813/? A/libc: Fatal signal 11 (SIGSEGV), code 1, fault addr 0xd4a80000 in tid 8813 (roidJUnitRunner)
2019-03-13 00:35:48.073 8818-8818/? A/DEBUG: *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
2019-03-13 00:35:48.073 8818-8818/? A/DEBUG: Build fingerprint: 'Android/vbox86p/vbox86p:7.1.1/NMF26Q/genymo09291150:userdebug/test-keys'
2019-03-13 00:35:48.073 8818-8818/? A/DEBUG: Revision: '0'
2019-03-13 00:35:48.073 8818-8818/? A/DEBUG: ABI: 'x86'
2019-03-13 00:35:48.073 8818-8818/? A/DEBUG: pid: 8797, tid: 8813, name: roidJUnitRunner  >>> io.realm.test <<<
2019-03-13 00:35:48.073 8818-8818/? A/DEBUG: signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0xd4a80000
2019-03-13 00:35:48.073 8818-8818/? A/DEBUG:     eax d3cca4f7  ebx d5377dd0  ecx d5795244  edx d5370c4c
2019-03-13 00:35:48.073 8818-8818/? A/DEBUG:     esi d4a80000  edi d579505c
2019-03-13 00:35:48.073 8818-8818/? A/DEBUG:     xcs 00000023  xds 0000002b  xes 0000002b  xfs 0000006b  xss 0000002b
2019-03-13 00:35:48.073 8818-8818/? A/DEBUG:     eip d4e23c62  ebp d5795128  esp d5795040  flags 00010216
2019-03-13 00:35:48.074 8818-8818/? A/DEBUG: backtrace:
2019-03-13 00:35:48.074 8818-8818/? A/DEBUG:     #00 pc 00129c62  /data/app/io.realm.test-1/lib/x86/librealm-jni.so (_ZL13string_to_hexRKSsRN5realm10StringDataEPKcS5_PtS6_jj+264)
2019-03-13 00:35:48.074 8818-8818/? A/DEBUG:     #01 pc 0012a7df  /data/app/io.realm.test-1/lib/x86/librealm-jni.so (_Z10to_jstringP7_JNIEnvN5realm10StringDataE+1471)
2019-03-13 00:35:48.074 8818-8818/? A/DEBUG:     #02 pc 0011d8fe  /data/app/io.realm.test-1/lib/x86/librealm-jni.so (Java_io_realm_internal_UncheckedRow_nativeGetString+574)
2019-03-13 00:35:48.074 8818-8818/? A/DEBUG:     #03 pc 000970ef  /data/app/io.realm.test-1/lib/x86/librealm-jni.so (Java_io_realm_internal_CheckedRow_nativeGetString+113)
2019-03-13 00:35:48.074 8818-8818/? A/DEBUG:     #04 pc 00a89f74  /data/app/io.realm.test-1/oat/x86/base.odex (offset 0x9cc000)

At first glance, it appears to be a bug in Java's string_to_hex method, but if I read the String in Realm Studio it looks empty (which probably means that JavaScript also fails to read the string but just fails silently). I'll debug further.
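The "Read: 'comment' on debtId: ..." line in the log above suggests a simple way to locate a bad record: log the field and object id before each read, because a native SIGSEGV cannot be caught from Java and the last line logged is the only trace left. A minimal sketch of that idea follows. `StringFieldProbe` and its parameters are hypothetical, and plain maps stand in for Realm objects so that the example runs without Realm on the classpath.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Function;

/** Hypothetical probe: reads every string field of every row, logging before each read. */
final class StringFieldProbe {
    static <R> int readAll(List<R> rows,
                           Function<R, Object> idOf,
                           Map<String, Function<R, String>> getters,
                           Consumer<String> log) {
        int reads = 0;
        for (R row : rows) {
            Object id = idOf.apply(row);
            for (Map.Entry<String, Function<R, String>> getter : getters.entrySet()) {
                // Log BEFORE the read: if the read crashes natively, the process dies
                // and this line is the last one written.
                log.accept("Read: '" + getter.getKey() + "' on id: " + id);
                getter.getValue().apply(row);
                reads++;
            }
        }
        return reads;
    }

    public static void main(String[] args) {
        // Stand-in row; in a real app this would be a Realm object.
        Map<String, String> row = new LinkedHashMap<>();
        row.put("id", "44294208");
        row.put("comment", "ok");

        Map<String, Function<Map<String, String>, String>> getters = new LinkedHashMap<>();
        getters.put("comment", r -> r.get("comment"));

        int reads = readAll(Collections.singletonList(row), r -> r.get("id"), getters,
                System.err::println);
        System.err.println(reads + " reads completed");
    }
}
```

Writing to an unbuffered sink (standard error here, logcat on a device) matters, since buffered output may be lost when the process is killed.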

vladimirfx commented 5 years ago

I have sent another corrupted DB, from an Android 9 device. It crashes when the database is opened. crush_log.txt

ggajews commented 5 years ago

@cmelchior any progress in debugging this issue?

sirivanleo commented 5 years ago

Another report: this time the user reports that the app just crashed repeatedly, but I couldn't get more info besides the native stack trace. We are using Android Realm 5.8.0 with encryption, no cloud DB.

pid: 0, tid: 0 >>> com.our.app <<<

backtrace:
  #00  pc 000000000003e1c0  /data/app/com.our.app-6FDq9SxoM5L2Ul5CPNbniQ==/lib/arm64/librealm-jni.so
  #01  pc 00000000000c93b8  /data/app/com.our.app-6FDq9SxoM5L2Ul5CPNbniQ==/lib/arm64/librealm-jni.so
  #02  pc 00000000000be8e0  /data/app/com.our.app-6FDq9SxoM5L2Ul5CPNbniQ==/lib/arm64/librealm-jni.so (Java_io_realm_internal_UncheckedRow_nativeGetString+92)
  #03  pc 0000000000a2df8c  /data/app/com.our.app-6FDq9SxoM5L2Ul5CPNbniQ==/oat/arm64/base.odex

jbkielis commented 5 years ago

We are running into the same issue. Any updates or ideas from the Realm team? Thanks.

mariusmorabosch commented 5 years ago

Same issue here. Members of my team also contacted Realm, but there has been no update on that. There are a lot of similar issues open with native crashes, but it doesn't seem like the Realm team is giving any insights or feedback on them, only asking for samples, etc. Any update on this?

Practically all the bugs in the affected app are caused by these Realm issues.

vladimirfx commented 5 years ago

Our team has active paid support. We provided 2 or 3 corrupted databases with encryption keys. Unfortunately, there has been no update. Now we are considering a painful database switch because of this and similar native crashes. That's sad...

EdwardvanRaak commented 5 years ago

Has anyone tried reverting back to a version from (for example) last year as some kind of last-resort measure?

I believe in most cases reverting back would not require a migration?

Zhuinden commented 5 years ago

Realm schema file format upgrades are not backwards compatible.

bmunkholm commented 5 years ago

Really sorry about the lack of updates here! We had an internal issue tracking the detailed troubleshooting efforts and missed getting back on this one. The status is that although we can reproduce the crash with the seemingly corrupt files provided, we unfortunately still have no leads on how they could end up in that state. There is nothing we would like more than to be able to replicate the issue and track down the root cause. Doing that is an iterative troubleshooting exercise that requires us to be able to replicate getting from a good state to the bad state. In the past, we have found some of these "hard to track" bugs to be deep down in the OS, and it's not entirely trivial to pinpoint that without being able to replicate the issue.

The only things that could help us with this would be to get very detailed stats about where this issue is occurring (device and OS versions), to give us hints as to whether this is really OS dependent. Better still would be a way to replicate the issue from a good state (not just replicate the crash happening from a bad state). But even getting more realm files to see a pattern would be potentially useful. We highly appreciate your understanding and help in resolving this.

vladimirfx commented 5 years ago

OS versions for our app: API 23 to 28. Vendors: Xiaomi, Samsung, Lenovo, Huawei, Asus. Can you provide a special debug build of Realm? We can try to reproduce the issue on it.

bmunkholm commented 5 years ago

Sure we can provide a version with all assertions turned on and hope to catch this earlier. Are you able to reproduce this internally? Can it also be reproduced when not using encryption?

vladimirfx commented 5 years ago

Sure we can provide a version with all assertions turned on and hope to catch this earlier.

Coordinates of the binaries?

Are you able to reproduce this internally?

Yes, our team already sent 3 corrupted DBs.

Can it also be reproduced when not using encryption?

We use Realm in several projects. This particular bug is reproducible only in projects with encryption enabled.

bmunkholm commented 5 years ago

@vladimirfx It's not something we have available by default, but something we will have to build. @cmelchior should be able to get that together for you. I'll let him comment on when that can be done.

It is interesting if this only happens for encrypted realms. Has anyone else seen this in non-encrypted realms?

ggajews commented 5 years ago

Stats from us:

OS version:

Android 8.1 605 34.5%
Android 8.0 446 25.4%
Android 7.0 400 22.8%
Android 6.0 289 16.5%

Devices:

Galaxy XCover4 (xcover4lte) 603 34.4%
LG K8 LTE (mm1vn) 264 15.0%
Galaxy J5 (j5y17lte) 171 9.7%
Galaxy S7 edge (hero2lte) 120 6.8%
Xperia L1 (G3311) 91 5.2%

Realm IS encrypted

xMickeymikex commented 5 years ago

@bmunkholm I have also encountered this issue using an encrypted realm, although I cannot replicate it using non-encrypted realms. I described it here: https://github.com/realm/realm-java/issues/6562

EdwardvanRaak commented 5 years ago

I have no knowledge about the inner workings of Realm-Core, but I want to understand why realm files cannot be validated internally to not contain corrupt or erroneous data and auto-recover to a clean slate. The fact that realm files can go "corrupt" and require a full uninstall/reinstall of the application is very bad for such a widely used database.

Also, has anyone tried to build their own encryption layer/interface and disable the encryption done by realm? We would have tried something like this if it wasn't for our encryption key rotating every so often.
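To make the "own encryption layer" idea concrete, here is a minimal sketch of encrypting individual string values with AES-GCM from the standard `javax.crypto` API before they are stored, for example in a `byte[]` field of an unencrypted realm. Nothing here comes from Realm: `FieldCrypto` is a hypothetical helper, key storage and rotation are out of scope, and nobody in this thread has confirmed that this avoids the crash. It also means encrypted fields can no longer be queried by value.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

/** Hypothetical field-level encryption helper; layout of each blob is IV || ciphertext+tag. */
final class FieldCrypto {
    private static final String TRANSFORMATION = "AES/GCM/NoPadding";
    private static final int IV_BYTES = 12;   // commonly recommended nonce size for GCM
    private static final int TAG_BITS = 128;  // authentication tag length
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Generates a fresh AES key; a real app would keep it in a key store. */
    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(256);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    static byte[] encrypt(SecretKey key, String plaintext) {
        try {
            byte[] iv = new byte[IV_BYTES];
            RANDOM.nextBytes(iv); // never reuse an IV with the same key
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] ciphertext = cipher.doFinal(plaintext.getBytes(StandardCharsets.UTF_8));
            return ByteBuffer.allocate(iv.length + ciphertext.length)
                    .put(iv).put(ciphertext).array();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Throws IllegalStateException if the blob was modified or the key is wrong. */
    static String decrypt(SecretKey key, byte[] blob) {
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.DECRYPT_MODE, key,
                    new GCMParameterSpec(TAG_BITS, blob, 0, IV_BYTES));
            byte[] plaintext = cipher.doFinal(blob, IV_BYTES, blob.length - IV_BYTES);
            return new String(plaintext, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("decryption failed", e);
        }
    }

    public static void main(String[] args) {
        SecretKey key = newKey();
        byte[] blob = encrypt(key, "Контрагент");
        System.out.println(decrypt(key, blob));
    }
}
```

Because GCM authenticates the ciphertext, a value that was damaged on disk fails to decrypt with a Java exception. A rotating key, as mentioned above, would still require re-encrypting every stored value.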

Zhuinden commented 5 years ago

The fact that realm files can go "corrupt" and require a full uninstall/reinstall of the application is very bad for such a widely used database.

Alternately you can catch the exception and delete the Realm file if it fails to open, but that's clearly not something you want to do for something that stores important data that you only have locally. Also problematic if for example it'd break inside a Realm file on the ROS.

Personally I would expect Realm to be able to detect that a transaction would cause corruption, instead of validating only on start-up after it is too late to do any recovery (as the transaction history is also in the Realm file).
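The catch-and-delete fallback mentioned in this comment can be sketched roughly as follows. This is a minimal illustration, not Realm's own API: `OpenOrReset`, `open`, and `wipe` are hypothetical names. In a real app, `open` would presumably wrap `Realm.getInstance(config)` and `wipe` would wrap `Realm.deleteRealm(config)`. It only helps when the failure surfaces as a Java exception; a native SIGSEGV like the one in this issue ends the process before any catch block runs.

```java
import java.util.function.Supplier;

/** Hypothetical helper: try to open a store; on a Java-level failure, wipe it once and retry. */
final class OpenOrReset {
    private OpenOrReset() {}

    /**
     * @param open opens the database (assumed to wrap something like Realm.getInstance(config))
     * @param wipe deletes the database files (assumed to wrap something like Realm.deleteRealm(config))
     */
    static <T> T openOrReset(Supplier<T> open, Runnable wipe) {
        try {
            return open.get();
        } catch (RuntimeException firstFailure) {
            // Only Java exceptions arrive here. A native crash never reaches this block.
            wipe.run();
            // A second failure is not caught and propagates to the caller.
            return open.get();
        }
    }
}
```

As noted in the comment, this throws away all local data, so it only suits data that can be rebuilt from elsewhere.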

xMickeymikex commented 5 years ago

Alternately you can catch the exception and delete the Realm file if it fails to open, but that's clearly not something you want to do for something that stores important data that you only have locally.

Well, that's not really a solution, except that you get rid of the crashes. Not to mention that catching native crashes is problematic.

Personally I would expect Realm to be able to detect that a transaction would cause corruption, instead of validating only on start-up after it is too late to do any recovery (as the transaction history is also in the Realm file).

That could actually be very helpful and maybe the "quickest" solution to the problem, but I still don't understand why these transaction errors occur only on encrypted DBs.

waqas-ansari commented 5 years ago

I have this crash too. But it happens rarely, and on a totally valid DB.

Device: OnePlus 3T
Android Version: 9.0
Android Studio Version: 3.4.2
Realm Version: 5.10.0

Below is the only code which is causing this:

RealmManagerUtil realmManager = Injector.get().realmManager();
realmManager.runTransaction(new Realm.Transaction() {
    @Override
    public void execute(@NonNull Realm realm) {
        RealmResults<TransactionHistory> results = realm
                .where(TransactionHistory.class)
                .beginGroup()
                .equalTo(TransactionHistory.FIELD_FROM, contactId)
                .or()
                .equalTo(TransactionHistory.FIELD_TO, contactId)
                .endGroup()
                .greaterThanOrEqualTo(TransactionHistory.FIELD_SERIAL_NUMBER, Session.NEGATIVE_S_NO_THRESHOLD)
                .findAll()
                .sort(TransactionHistory.FIELD_SERIAL_NUMBER);
        if (results != null && results.size() > 0) {
            for (TransactionHistory history : results) {
                Long sNo = Session.getAndUpdateSerialNumber();
                Logger.showError("WOP-12136 - reassignSNoToLargerSNo - Txn No: " + history.getTransactionNumber() +
                        ", Amount: " + history.getAmount() + ", Old S. No.: " + history.getSerialNumber() + ", New S. No.: " + sNo);
                history.setSerialNumber(sNo);
            }
        }
    }
}, new Realm.Transaction.OnSuccess() {
    @Override
    public void onSuccess() {
        if(onSuccess != null) onSuccess.onSuccess();
    }
}, new Realm.Transaction.OnError() {
    @Override
    public void onError(Throwable error) {

    }
}, true);

Note: true in the last line indicates that the transaction is async.

Below is the crash I am experiencing.

    --------- beginning of crash
2019-08-09 10:26:12.014 15580-15881/? A/libc: Fatal signal 11 (SIGSEGV), code 2 (SEGV_ACCERR), fault addr 0x7b152af000 in tid 15881 (pool-5-thread-8), pid 15580 (o.foree.app.dev)
2019-08-09 10:26:12.148 15889-15889/? A/DEBUG: *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
2019-08-09 10:26:12.149 15889-15889/? A/DEBUG: Build fingerprint: 'OnePlus/OnePlus3/OnePlus3T:9/PKQ1.181203.001/1906252001:user/release-keys'
2019-08-09 10:26:12.149 15889-15889/? A/DEBUG: Revision: '0'
2019-08-09 10:26:12.149 15889-15889/? A/DEBUG: ABI: 'arm64'
2019-08-09 10:26:12.149 15889-15889/? A/DEBUG: pid: 15580, tid: 15881, name: pool-5-thread-8  >>> co.foree.app.dev <<<
2019-08-09 10:26:12.149 15889-15889/? A/DEBUG: signal 11 (SIGSEGV), code 2 (SEGV_ACCERR), fault addr 0x7b152af000
2019-08-09 10:26:12.149 15889-15889/? A/DEBUG:     x0  0000007b152aeff0  x1  0000007b1451e780  x2  0000000000000e30  x3  0000000000000034
2019-08-09 10:26:12.149 15889-15889/? A/DEBUG:     x4  0000000034643591  x5  00000000000000c3  x6  00000000000000b9  x7  0000007b19defadc
2019-08-09 10:26:12.149 15889-15889/? A/DEBUG:     x8  00000000e600000d  x9  0000000000000000  x10 0000007b1451ec80  x11 0000007b152ae6fc
2019-08-09 10:26:12.149 15889-15889/? A/DEBUG:     x12 000000000000000a  x13 fffffffffffffffa  x14 0000000000000cb0  x15 0000000000000002
2019-08-09 10:26:12.149 15889-15889/? A/DEBUG:     x16 0000007b1818f308  x17 0000007bb4d9f4b8  x18 00000000000000c2  x19 0000000000000e68
2019-08-09 10:26:12.149 15889-15889/? A/DEBUG:     x20 0000007b152ae6f8  x21 000000000002a6f8  x22 000000000002a6f8  x23 0000007b150483a0
2019-08-09 10:26:12.149 15889-15889/? A/DEBUG:     x24 0000007b15284000  x25 0000007b1451de48  x26 0000000041414141  x27 0000000000000002
2019-08-09 10:26:12.149 15889-15889/? A/DEBUG:     x28 0000007b118f69d0  x29 0000007b118f6860
2019-08-09 10:26:12.149 15889-15889/? A/DEBUG:     sp  0000007b118f6040  lr  0000007b180130ac  pc  0000007bb4d9f6b8
2019-08-09 10:26:12.202 15889-15889/? A/DEBUG: backtrace:
2019-08-09 10:26:12.202 15889-15889/? A/DEBUG:     #00 pc 000000000001f6b8  /system/lib64/libc.so (memcpy+512)
2019-08-09 10:26:12.202 15889-15889/? A/DEBUG:     #01 pc 00000000001bd0a8  /data/app/co.foree.app.dev-_JgdodELP-UlOvPDtS2S9A==/lib/arm64/librealm-jni.so
2019-08-09 10:26:12.202 15889-15889/? A/DEBUG:     #02 pc 0000000000140ba4  /data/app/co.foree.app.dev-_JgdodELP-UlOvPDtS2S9A==/lib/arm64/librealm-jni.so
2019-08-09 10:26:12.202 15889-15889/? A/DEBUG:     #03 pc 0000000000143d70  /data/app/co.foree.app.dev-_JgdodELP-UlOvPDtS2S9A==/lib/arm64/librealm-jni.so
2019-08-09 10:26:12.202 15889-15889/? A/DEBUG:     #04 pc 0000000000143d84  /data/app/co.foree.app.dev-_JgdodELP-UlOvPDtS2S9A==/lib/arm64/librealm-jni.so
2019-08-09 10:26:12.202 15889-15889/? A/DEBUG:     #05 pc 0000000000143d84  /data/app/co.foree.app.dev-_JgdodELP-UlOvPDtS2S9A==/lib/arm64/librealm-jni.so
2019-08-09 10:26:12.202 15889-15889/? A/DEBUG:     #06 pc 00000000001bda8c  /data/app/co.foree.app.dev-_JgdodELP-UlOvPDtS2S9A==/lib/arm64/librealm-jni.so
2019-08-09 10:26:12.202 15889-15889/? A/DEBUG:     #07 pc 00000000001b91f8  /data/app/co.foree.app.dev-_JgdodELP-UlOvPDtS2S9A==/lib/arm64/librealm-jni.so
2019-08-09 10:26:12.202 15889-15889/? A/DEBUG:     #08 pc 00000000001b94a0  /data/app/co.foree.app.dev-_JgdodELP-UlOvPDtS2S9A==/lib/arm64/librealm-jni.so
2019-08-09 10:26:12.202 15889-15889/? A/DEBUG:     #09 pc 00000000001bb314  /data/app/co.foree.app.dev-_JgdodELP-UlOvPDtS2S9A==/lib/arm64/librealm-jni.so
2019-08-09 10:26:12.202 15889-15889/? A/DEBUG:     #10 pc 000000000011b698  /data/app/co.foree.app.dev-_JgdodELP-UlOvPDtS2S9A==/lib/arm64/librealm-jni.so
2019-08-09 10:26:12.202 15889-15889/? A/DEBUG:     #11 pc 00000000001003d8  /data/app/co.foree.app.dev-_JgdodELP-UlOvPDtS2S9A==/lib/arm64/librealm-jni.so
2019-08-09 10:26:12.202 15889-15889/? A/DEBUG:     #12 pc 000000000007cc60  /data/app/co.foree.app.dev-_JgdodELP-UlOvPDtS2S9A==/lib/arm64/librealm-jni.so (Java_io_realm_internal_OsSharedRealm_nativeCommitTransaction+48)
2019-08-09 10:26:12.202 15889-15889/? A/DEBUG:     #13 pc 000000000056a1e0  /system/lib64/libart.so (art_quick_generic_jni_trampoline+144)
2019-08-09 10:26:12.202 15889-15889/? A/DEBUG:     #14 pc 000000000056144c  /system/lib64/libart.so (art_quick_invoke_static_stub+604)
2019-08-09 10:26:12.202 15889-15889/? A/DEBUG:     #15 pc 00000000000cf6d8  /system/lib64/libart.so (art::ArtMethod::Invoke(art::Thread*, unsigned int*, unsigned int, art::JValue*, char const*)+232)
2019-08-09 10:26:12.202 15889-15889/? A/DEBUG:     #16 pc 0000000000282b00  /system/lib64/libart.so (art::interpreter::ArtInterpreterToCompiledCodeBridge(art::Thread*, art::ArtMethod*, art::ShadowFrame*, unsigned short, art::JValue*)+344)
2019-08-09 10:26:12.203 15889-15889/? A/DEBUG:     #17 pc 000000000027cbb0  /system/lib64/libart.so (bool art::interpreter::DoCall<false, false>(art::ArtMethod*, art::Thread*, art::ShadowFrame&, art::Instruction const*, unsigned short, art::JValue*)+960)
2019-08-09 10:26:12.203 15889-15889/? A/DEBUG:     #18 pc 00000000005317d8  /system/lib64/libart.so (MterpInvokeStatic+200)
2019-08-09 10:26:12.203 15889-15889/? A/DEBUG:     #19 pc 0000000000553a14  /system/lib64/libart.so (ExecuteMterpImpl+14612)
2019-08-09 10:26:12.203 15889-15889/? A/DEBUG:     #20 pc 000000000039c380  /dev/ashmem/dalvik-classes.dex extracted in memory from /data/app/co.foree.app.dev-_JgdodELP-UlOvPDtS2S9A==/base.apk (deleted) (io.realm.internal.OsSharedRealm.commitTransaction+4)
2019-08-09 10:26:12.203 15889-15889/? A/DEBUG:     #21 pc 0000000000256d10  /system/lib64/libart.so (_ZN3art11interpreterL7ExecuteEPNS_6ThreadERKNS_20CodeItemDataAccessorERNS_11ShadowFrameENS_6JValueEb.llvm.3399805955+488)
2019-08-09 10:26:12.203 15889-15889/? A/DEBUG:     #22 pc 000000000025c8c0  /system/lib64/libart.so (art::interpreter::ArtInterpreterToInterpreterBridge(art::Thread*, art::CodeItemDataAccessor const&, art::ShadowFrame*, art::JValue*)+216)
2019-08-09 10:26:12.203 15889-15889/? A/DEBUG:     #23 pc 000000000027cb94  /system/lib64/libart.so (bool art::interpreter::DoCall<false, false>(art::ArtMethod*, art::Thread*, art::ShadowFrame&, art::Instruction const*, unsigned short, art::JValue*)+932)
2019-08-09 10:26:12.203 15889-15889/? A/DEBUG:     #24 pc 0000000000530310  /system/lib64/libart.so (MterpInvokeVirtual+576)
2019-08-09 10:26:12.203 15889-15889/? A/DEBUG:     #25 pc 0000000000553894  /system/lib64/libart.so (ExecuteMterpImpl+14228)
2019-08-09 10:26:12.203 15889-15889/? A/DEBUG:     #26 pc 00000000003842ca  /dev/ashmem/dalvik-classes.dex extracted in memory from /data/app/co.foree.app.dev-_JgdodELP-UlOvPDtS2S9A==/base.apk (deleted) (io.realm.BaseRealm.commitTransaction+10)
2019-08-09 10:26:12.203 15889-15889/? A/DEBUG:     #27 pc 0000000000256d10  /system/lib64/libart.so (_ZN3art11interpreterL7ExecuteEPNS_6ThreadERKNS_20CodeItemDataAccessorERNS_11ShadowFrameENS_6JValueEb.llvm.3399805955+488)
2019-08-09 10:26:12.203 15889-15889/? A/DEBUG:     #28 pc 000000000025c8c0  /system/lib64/libart.so (art::interpreter::ArtInterpreterToInterpreterBridge(art::Thread*, art::CodeItemDataAccessor const&, art::ShadowFrame*, art::JValue*)+216)
2019-08-09 10:26:12.203 15889-15889/? A/DEBUG:     #29 pc 000000000027cb94  /system/lib64/libart.so (bool art::interpreter::DoCall<false, false>(art::ArtMethod*, art::Thread*, art::ShadowFrame&, art::Instruction const*, unsigned short, art::JValue*)+932)
2019-08-09 10:26:12.203 15889-15889/? A/DEBUG:     #30 pc 0000000000530990  /system/lib64/libart.so (MterpInvokeSuper+1396)
2019-08-09 10:26:12.203 15889-15889/? A/DEBUG:     #31 pc 0000000000553914  /system/lib64/libart.so (ExecuteMterpImpl+14356)
2019-08-09 10:26:12.203 15889-15889/? A/DEBUG:     #32 pc 0000000000396320  /dev/ashmem/dalvik-classes.dex extracted in memory from /data/app/co.foree.app.dev-_JgdodELP-UlOvPDtS2S9A==/base.apk (deleted) (io.realm.Realm.commitTransaction)
2019-08-09 10:26:12.203 15889-15889/? A/DEBUG:     #33 pc 0000000000256d10  /system/lib64/libart.so (_ZN3art11interpreterL7ExecuteEPNS_6ThreadERKNS_20CodeItemDataAccessorERNS_11ShadowFrameENS_6JValueEb.llvm.3399805955+488)
2019-08-09 10:26:12.203 15889-15889/? A/DEBUG:     #34 pc 000000000025c8c0  /system/lib64/libart.so (art::interpreter::ArtInterpreterToInterpreterBridge(art::Thread*, art::CodeItemDataAccessor const&, art::ShadowFrame*, art::JValue*)+216)
2019-08-09 10:26:12.203 15889-15889/? A/DEBUG:     #35 pc 000000000027cb94  /system/lib64/libart.so (bool art::interpreter::DoCall<false, false>(art::ArtMethod*, art::Thread*, art::ShadowFrame&, art::Instruction const*, unsigned short, art::JValue*)+932)
2019-08-09 10:26:12.203 15889-15889/? A/DEBUG:     #36 pc 0000000000530310  /system/lib64/libart.so (MterpInvokeVirtual+576)
2019-08-09 10:26:12.203 15889-15889/? A/DEBUG:     #37 pc 0000000000553894  /system/lib64/libart.so (ExecuteMterpImpl+14228)
2019-08-09 10:26:12.203 15889-15889/? A/DEBUG:     #38 pc 000000000038b618  /dev/ashmem/dalvik-classes.dex extracted in memory from /data/app/co.foree.app.dev-_JgdodELP-UlOvPDtS2S9A==/base.apk (deleted) (io.realm.Realm$1.run+116)
2019-08-09 10:26:12.203 15889-15889/? A/DEBUG:     #39 pc 0000000000256d10  /system/lib64/libart.so (_ZN3art11interpreterL7ExecuteEPNS_6ThreadERKNS_20CodeItemDataAccessorERNS_11ShadowFrameENS_6JValueEb.llvm.3399805955+488)
2019-08-09 10:26:12.203 15889-15889/? A/DEBUG:     #40 pc 000000000025c8c0  /system/lib64/libart.so (art::interpreter::ArtInterpreterToInterpreterBridge(art::Thread*, art::CodeItemDataAccessor const&, art::ShadowFrame*, art::JValue*)+216)
2019-08-09 10:26:12.203 15889-15889/? A/DEBUG:     #41 pc 000000000027cb94  /system/lib64/libart.so (bool art::interpreter::DoCall<false, false>(art::ArtMethod*, art::Thread*, art::ShadowFrame&, art::Instruction const*, unsigned short, art::JValue*)+932)
2019-08-09 10:26:12.203 15889-15889/? A/DEBUG:     #42 pc 0000000000531264  /system/lib64/libart.so (MterpInvokeInterface+1376)
2019-08-09 10:26:12.203 15889-15889/? A/DEBUG:     #43 pc 0000000000553a94  /system/lib64/libart.so (ExecuteMterpImpl+14740)
2019-08-09 10:26:12.203 15889-15889/? A/DEBUG:     #44 pc 00000000003a16b2  /dev/ashmem/dalvik-classes.dex extracted in memory from /data/app/co.foree.app.dev-_JgdodELP-UlOvPDtS2S9A==/base.apk (deleted) (io.realm.internal.async.BgPriorityRunnable.run+14)
2019-08-09 10:26:12.203 15889-15889/? A/DEBUG:     #45 pc 0000000000256d10  /system/lib64/libart.so (_ZN3art11interpreterL7ExecuteEPNS_6ThreadERKNS_20CodeItemDataAccessorERNS_11ShadowFrameENS_6JValueEb.llvm.3399805955+488)
2019-08-09 10:26:12.203 15889-15889/? A/DEBUG:     #46 pc 000000000025c8c0  /system/lib64/libart.so (art::interpreter::ArtInterpreterToInterpreterBridge(art::Thread*, art::CodeItemDataAccessor const&, art::ShadowFrame*, art::JValue*)+216)
2019-08-09 10:26:12.203 15889-15889/? A/DEBUG:     #47 pc 000000000027cb94  /system/lib64/libart.so (bool art::interpreter::DoCall<false, false>(art::ArtMethod*, art::Thread*, art::ShadowFrame&, art::Instruction const*, unsigned short, art::JValue*)+932)
2019-08-09 10:26:12.203 15889-15889/? A/DEBUG:     #48 pc 0000000000531264  /system/lib64/libart.so (MterpInvokeInterface+1376)
2019-08-09 10:26:12.203 15889-15889/? A/DEBUG:     #49 pc 0000000000553a94  /system/lib64/libart.so (ExecuteMterpImpl+14740)
2019-08-09 10:26:12.203 15889-15889/? A/DEBUG:     #50 pc 0000000000112f9c  /system/framework/boot-core-oj.vdex (java.util.concurrent.Executors$RunnableAdapter.call+4)
2019-08-09 10:26:12.203 15889-15889/? A/DEBUG:     #51 pc 0000000000256d10  /system/lib64/libart.so (_ZN3art11interpreterL7ExecuteEPNS_6ThreadERKNS_20CodeItemDataAccessorERNS_11ShadowFrameENS_6JValueEb.llvm.3399805955+488)
2019-08-09 10:26:12.203 15889-15889/? A/DEBUG:     #52 pc 000000000025c8c0  /system/lib64/libart.so (art::interpreter::ArtInterpreterToInterpreterBridge(art::Thread*, art::CodeItemDataAccessor const&, art::ShadowFrame*, art::JValue*)+216)
2019-08-09 10:26:12.203 15889-15889/? A/DEBUG:     #53 pc 000000000027cb94  /system/lib64/libart.so (bool art::interpreter::DoCall<false, false>(art::ArtMethod*, art::Thread*, art::ShadowFrame&, art::Instruction const*, unsigned short, art::JValue*)+932)
2019-08-09 10:26:12.203 15889-15889/? A/DEBUG:     #54 pc 0000000000531264  /system/lib64/libart.so (MterpInvokeInterface+1376)
2019-08-09 10:26:12.204 15889-15889/? A/DEBUG:     #55 pc 0000000000553a94  /system/lib64/libart.so (ExecuteMterpImpl+14740)
2019-08-09 10:26:12.204 15889-15889/? A/DEBUG:     #56 pc 00000000001137ca  /system/framework/boot-core-oj.vdex (java.util.concurrent.FutureTask.run+62)
2019-08-09 10:26:12.204 15889-15889/? A/DEBUG:     #57 pc 0000000000256d10  /system/lib64/libart.so (_ZN3art11interpreterL7ExecuteEPNS_6ThreadERKNS_20CodeItemDataAccessorERNS_11ShadowFrameENS_6JValueEb.llvm.3399805955+488)
2019-08-09 10:26:12.204 15889-15889/? A/DEBUG:     #58 pc 000000000025c8c0  /system/lib64/libart.so (art::interpreter::ArtInterpreterToInterpreterBridge(art::Thread*, art::CodeItemDataAccessor const&, art::ShadowFrame*, art::JValue*)+216)
2019-08-09 10:26:12.204 15889-15889/? A/DEBUG:     #59 pc 000000000027cb94  /system/lib64/libart.so (bool art::interpreter::DoCall<false, false>(art::ArtMethod*, art::Thread*, art::ShadowFrame&, art::Instruction const*, unsigned short, art::JValue*)+932)
2019-08-09 10:26:12.204 15889-15889/? A/DEBUG:     #60 pc 0000000000531264  /system/lib64/libart.so (MterpInvokeInterface+1376)
2019-08-09 10:26:12.204 15889-15889/? A/DEBUG:     #61 pc 0000000000553a94  /system/lib64/libart.so (ExecuteMterpImpl+14740)
2019-08-09 10:26:12.204 15889-15889/? A/DEBUG:     #62 pc 00000000001177b0  /system/framework/boot-core-oj.vdex (java.util.concurrent.ThreadPoolExecutor.runWorker+162)
2019-08-09 10:26:12.204 15889-15889/? A/DEBUG:     #63 pc 0000000000256d10  /system/lib64/libart.so (_ZN3art11interpreterL7ExecuteEPNS_6ThreadERKNS_20CodeItemDataAccessorERNS_11ShadowFrameENS_6JValueEb.llvm.3399805955+488)
2019-08-09 10:26:12.204 15889-15889/? A/DEBUG:     #64 pc 000000000025c8c0  /system/lib64/libart.so (art::interpreter::ArtInterpreterToInterpreterBridge(art::Thread*, art::CodeItemDataAccessor const&, art::ShadowFrame*, art::JValue*)+216)
2019-08-09 10:26:12.204 15889-15889/? A/DEBUG:     #65 pc 000000000027cb94  /system/lib64/libart.so (bool art::interpreter::DoCall<false, false>(art::ArtMethod*, art::Thread*, art::ShadowFrame&, art::Instruction const*, unsigned short, art::JValue*)+932)
2019-08-09 10:26:12.204 15889-15889/? A/DEBUG:     #66 pc 0000000000530310  /system/lib64/libart.so (MterpInvokeVirtual+576)
2019-08-09 10:26:12.204 15889-15889/? A/DEBUG:     #67 pc 0000000000553894  /system/lib64/libart.so (ExecuteMterpImpl+14228)
2019-08-09 10:26:12.204 15889-15889/? A/DEBUG:     #68 pc 0000000000116ade  /system/framework/boot-core-oj.vdex (java.util.concurrent.ThreadPoolExecutor$Worker.run+4)
2019-08-09 10:26:12.204 15889-15889/? A/DEBUG:     #69 pc 0000000000256d10  /system/lib64/libart.so (_ZN3art11interpreterL7ExecuteEPNS_6ThreadERKNS_20CodeItemDataAccessorERNS_11ShadowFrameENS_6JValueEb.llvm.3399805955+488)
2019-08-09 10:26:12.204 15889-15889/? A/DEBUG:     #70 pc 000000000025c8c0  /system/lib64/libart.so (art::interpreter::ArtInterpreterToInterpreterBridge(art::Thread*, art::CodeItemDataAccessor const&, art::ShadowFrame*, art::JValue*)+216)
2019-08-09 10:26:12.204 15889-15889/? A/DEBUG:     #71 pc 000000000027cb94  /system/lib64/libart.so (bool art::interpreter::DoCall<false, false>(art::ArtMethod*, art::Thread*, art::ShadowFrame&, art::Instruction const*, unsigned short, art::JValue*)+932)
2019-08-09 10:26:12.204 15889-15889/? A/DEBUG:     #72 pc 0000000000531264  /system/lib64/libart.so (MterpInvokeInterface+1376)
2019-08-09 10:26:12.204 15889-15889/? A/DEBUG:     #73 pc 0000000000553a94  /system/lib64/libart.so (ExecuteMterpImpl+14740)
2019-08-09 10:26:12.204 15889-15889/? A/DEBUG:     #74 pc 00000000000cae46  /system/framework/boot-core-oj.vdex (java.lang.Thread.run+12)
2019-08-09 10:26:12.204 15889-15889/? A/DEBUG:     #75 pc 0000000000256d10  /system/lib64/libart.so (_ZN3art11interpreterL7ExecuteEPNS_6ThreadERKNS_20CodeItemDataAccessorERNS_11ShadowFrameENS_6JValueEb.llvm.3399805955+488)
2019-08-09 10:26:12.204 15889-15889/? A/DEBUG:     #76 pc 0000000000520ad8  /system/lib64/libart.so (artQuickToInterpreterBridge+944)
2019-08-09 10:26:12.204 15889-15889/? A/DEBUG:     #77 pc 000000000056a2fc  /system/lib64/libart.so (art_quick_to_interpreter_bridge+92)
2019-08-09 10:26:12.204 15889-15889/? A/DEBUG:     #78 pc 0000000000561188  /system/lib64/libart.so (art_quick_invoke_stub+584)
2019-08-09 10:26:12.204 15889-15889/? A/DEBUG:     #79 pc 00000000000cf6b8  /system/lib64/libart.so (art::ArtMethod::Invoke(art::Thread*, unsigned int*, unsigned int, art::JValue*, char const*)+200)
2019-08-09 10:26:12.204 15889-15889/? A/DEBUG:     #80 pc 0000000000466690  /system/lib64/libart.so (art::(anonymous namespace)::InvokeWithArgArray(art::ScopedObjectAccessAlreadyRunnable const&, art::ArtMethod*, art::(anonymous namespace)::ArgArray*, art::JValue*, char const*)+104)
2019-08-09 10:26:12.204 15889-15889/? A/DEBUG:     #81 pc 0000000000467794  /system/lib64/libart.so (art::InvokeVirtualOrInterfaceWithJValues(art::ScopedObjectAccessAlreadyRunnable const&, _jobject*, _jmethodID*, jvalue*)+424)
2019-08-09 10:26:12.204 15889-15889/? A/DEBUG:     #82 pc 0000000000492dc8  /system/lib64/libart.so (art::Thread::CreateCallback(void*)+1116)
2019-08-09 10:26:12.204 15889-15889/? A/DEBUG:     #83 pc 0000000000099508  /system/lib64/libc.so (__pthread_start(void*)+36)
2019-08-09 10:26:12.204 15889-15889/? A/DEBUG:     #84 pc 0000000000023e18  /system/lib64/libc.so (__start_thread+68)
theverybest commented 4 years ago

Dear developers, when will this bug be fixed? Or any comment on what to do?

yohanan commented 4 years ago

So...

The original incident was reported 1 year ago.

The last response from inside Realm was 3 months ago.

No workarounds have been provided.

Judging by the responses of others here, as well as our own experience, this issue does not seem limited to any specific makes, models, or operating system versions.

This issue continues to impact our users. Furthermore, a recent upgrade to realm-gradle-plugin (from 5.11.0 to 5.14.0) introduced different instabilities, forcing us to roll back.

I'm going to take the extraordinary and unfortunate step of completely removing Realm from our codebase. A separate project we recently spun up was planning to use Realm, but decided against it based on the instabilities we've experienced.

EdwardvanRaak commented 4 years ago

@yohanan What instabilities were you guys seeing after that upgrade? Because we also did the same recently and things got even worse here as well. Any crashes to share?

Anyways recently we saw a stacktrace of a native realm crash that occurred immediately after Fresco (our image library) did a com.facebook.imagepipeline.memory.NativeMemoryChunk.nativeFree so we literally replaced Fresco with Glide just to rule out anything weird happening there.

That wasn't the root issue though since our console is filling up with native crashes again but it just shows how desperate we are 😅

Zhuinden commented 4 years ago

I wonder though, if this really is something that only surfaces when encryption is used.

Of course, encryption should just work.

yohanan commented 4 years ago

@yohanan What instabilities were you guys seeing after that upgrade? Because we also did the same recently and things got even worse here as well. Any crashes to share?

Anyways recently we saw a stacktrace of a native realm crash that occurred immediately after Fresco (our image library) did a com.facebook.imagepipeline.memory.NativeMemoryChunk.nativeFree so we literally replaced Fresco with Glide just to rule out anything weird happening there.

That wasn't the root issue though since our console is filling up with native crashes again but it just shows how desperate we are 😅

@EdwardvanRaak the latest instability was found via our functional testing. we run the same tests many times a day, and i happened to make only a change upgrading Realm from 5.11 to 5.14 and the functionals started crashing. going back to 5.11 corrected it.

it is a different issue, as far as i can tell. it is an "invalid mnemonic" error, like the following...

Caused by: io.realm.exceptions.RealmFileException: Unable to open a realm at path '<<redacted>>': Invalid mnemonic. top_ref[0]: 0, top_ref[1]: 11A78, mnemonic: 0 0 0 0, fmt[0]: 0, fmt[1]: 9, flags: 1 Path:. (Invalid mnemonic. top_ref[0]: 0, top_ref[1]: 11A78, mnemonic: 0 0 0 0, fmt[0]: 0, fmt[1]: 9, flags: 1 Path: /<<redacted>>/files/latch.realm) (/data/data/<<redacted>>/files/latch.realm) in /Users/jasonflax/Development/realm-java/realm/realm-library/src/main/cpp/io_realm_internal_OsSharedRealm.cpp line 101 Kind: ACCESS_ERROR.
    at io.realm.internal.OsSharedRealm.nativeGetSharedRealm(Native Method)
    at io.realm.internal.OsSharedRealm.<init>(OsSharedRealm.java:171)
    at io.realm.internal.OsSharedRealm.getInstance(OsSharedRealm.java:241)
    at io.realm.internal.OsSharedRealm.getInstance(OsSharedRealm.java:231)
    at io.realm.RealmCache.doCreateRealmOrGetFromCache(RealmCache.java:332)
    at io.realm.RealmCache.createRealmOrGetFromCache(RealmCache.java:285)
    at io.realm.Realm.getDefaultInstance(Realm.java:407)

we've been dealing with the db corruption issues for a very long time, and then this new issue reared its head, from a clean install.

yohanan commented 4 years ago

I wonder though, if this really is something that only surfaces when encryption is used.

Of course, encryption should just work.

@Zhuinden i also am curious about this. we do enable encryption. in fact, this was one of the main reasons for using Realm. however, having encryption but corrupting your database and having to reinstall the app sort of defeats the purpose. so, yes, i am curious if this is only from encryption.

sirivanleo commented 4 years ago

Yeah we gave up on fixing this issue, we just live with it now. We are planning to eventually move away from realm but it's a big burden. We basically turned realm into a cache layer(removed all sensitive data and put it elsewhere), anytime it corrupts or crashes we simply wipe and restart on the fly. I think this is the only known workaround.

Also I do believe this is a result of using encryption but I'm not completely sure, since we need encryption we accept that its just cache and its fine to delete and start over at any time. Honestly not a great answer, but we paid for support as well and got nowhere...so this is where we ended up.

yohanan commented 4 years ago

we've been using Realm for about 2.5+ years. i don't recall this issue until about a year ago, so i wonder if there was some sort of bug introduced in a particular version. the original post cites v5.3.1. i cannot say what version we started seeing it -- it could actually be a combination of a particular Realm version coupled with how we use it changing over time. i do remember a long stretch at the beginning, though, where there were no real issues.

internally, we've been wondering if this has anything to do with Strings. i haven't looked at these crash logs in depth lately, but we seem to only see it as corruption when dealing with Strings. of course, a lot of what we store are Strings, so may be simple probability. was considering migrating all our String properties to byte-arrays, to see if it had any impact. this is cumbersome, sure.
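For anyone wanting to try the String-to-byte-array experiment described above, the conversion itself is small. The sketch below is only an illustration: `StringBytes` and its method names are hypothetical, and whether storing byte arrays instead of Strings has any effect on the corruption is untested.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for storing text in a byte[] field instead of a String field. */
final class StringBytes {
    private StringBytes() {}

    /** Encodes text as UTF-8 before writing it to a byte[] property; null stays null. */
    static byte[] toBytes(String value) {
        return value == null ? null : value.getBytes(StandardCharsets.UTF_8);
    }

    /** Decodes a stored byte[] property back into text; null stays null. */
    static String fromBytes(byte[] stored) {
        return stored == null ? null : new String(stored, StandardCharsets.UTF_8);
    }
}
```

A model class could keep its String getters and setters and call these helpers internally, so the rest of the codebase would not need to change. Queries that match on text content would need separate handling.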

has Realm reached out to anyone? some time ago one of our developers submitted crash logs, corrupted databases, as well as detailed information, but i don't believe we ever heard back from Realm.

yohanan commented 4 years ago

Yeah we gave up on fixing this issue, we just live with it now. We are planning to eventually move away from realm but it's a big burden. We basically turned realm into a cache layer(removed all sensitive data and put it elsewhere), anytime it corrupts or crashes we simply wipe and restart on the fly. I think this is the only known workaround.

Also I do believe this is a result of using encryption but I'm not completely sure, since we need encryption we accept that its just cache and its fine to delete and start over at any time. Honestly not a great answer, but we paid for support as well and got nowhere...so this is where we ended up.

@sirivanleo (1) how do you catch these corruption exceptions? since most if not all are coming from the JNI, they are difficult to catch and respond.

(2) in the meantime, what is the mechanism you're using for sensitive info that you moved from Realm?

(3) when you move from Realm, what is your longterm approach for an encrypted db?

EdwardvanRaak commented 4 years ago

@theverybest @jbkielis @mariusmorabosch @waqas-ansari Do you guys also have encryption enabled?

Because I could let some of my users try an unencrypted realm file (and solve the security aspect differently) for a while to absolutely confirm that this is the issue.

xMickeymikex commented 4 years ago

@yohanan @sirivanleo Are you using external storage to hold your DB files? As far as I know, Android makes sure internal storage for each app has very restricted access: it is only available to that app. Unless the user roots the device or you enable "backup" for your app, you can't really access the data from outside the package. So there should not be much need for encryption on top of the DB. What do you think?

EdwardvanRaak commented 4 years ago

@yohanan

(1) how do you catch these corruption exceptions? since most if not all are coming from the JNI, they are difficult to catch and respond.

You pretty much can't.

We wrap a number of validating Realm transactions and the Realm init code in this wrapper:

@SuppressLint("ApplySharedPref")
fun <T> safeExecute(block: () -> T): T {
    sharedPreferences.edit().putInt(REALM_VALIDATION, FLAG_START).commit()
    val result = block()
    sharedPreferences.edit().putInt(REALM_VALIDATION, FLAG_END).commit()
    return result
}

Then on startup we delete the realm files if the FLAG_END was not hit. It's not pretty but it works for native crashes triggered during startup. At least sometimes our users can continue rather than uninstall the app. But the corruption sometimes happens on a single field, or during background syncs, and then you're pretty much screwed (unless you use a timer approach).
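The startup half of this approach is not shown above, so here is a rough Java sketch of both halves together. It is an assumption-laden illustration rather than the actual code: `CrashMarker` and its method names are hypothetical, and a small marker file stands in for the SharedPreferences flag so the example is self-contained. `deleteDatabaseFiles` is a placeholder for whatever removes the realm files in your app.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.function.Supplier;

/** Hypothetical helper: persists a START/END flag in a small marker file. */
final class CrashMarker {
    private static final String FLAG_START = "START";
    private static final String FLAG_END = "END";

    private final File markerFile;

    CrashMarker(File markerFile) {
        this.markerFile = markerFile;
    }

    /** Same idea as the wrapper above: flag before the risky work, flag again after it. */
    <T> T safeExecute(Supplier<T> block) {
        write(FLAG_START);
        T result = block.get(); // If the process dies here (or this throws), FLAG_END is never written.
        write(FLAG_END);
        return result;
    }

    /** True when the last guarded block started but never reached FLAG_END. */
    boolean lastRunDidNotFinish() {
        return markerFile.exists() && FLAG_START.equals(read());
    }

    /** Call at startup, before opening the database. Returns true if a wipe happened. */
    boolean wipeIfLastRunDidNotFinish(Runnable deleteDatabaseFiles) {
        if (!lastRunDidNotFinish()) {
            return false;
        }
        deleteDatabaseFiles.run();
        write(FLAG_END); // Clear the flag so the next start does not wipe again.
        return true;
    }

    private void write(String flag) {
        try {
            Files.write(markerFile.toPath(), flag.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private String read() {
        try {
            return new String(Files.readAllBytes(markerFile.toPath()), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Like the wrapper it mirrors, this treats any block that did not finish as a crash, including one that threw an ordinary Java exception, so a wipe can be triggered by failures unrelated to corruption.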

Also, cmelchior mentioned that reading the corrupt string field in JavaScript does not cause a crash but silently fails, so I'm wondering how JavaScript is able to do that.

theverybest commented 4 years ago

@EdwardvanRaak Yes, we use Realm with encryption enabled - that was one of the reasons for choosing Realm database.

bmunkholm commented 4 years ago

I just want to assure everyone that this issue is definitely something we are very concerned about. It's increasingly annoying that we have been unsuccessful in reproducing the issue, and hence have not had a chance to fix it. The issue can be within Realm, which is totally likely, but it could also be in Android or specific implementations or devices. The interesting part is that we don't see the same issue in iOS and JS, which could indicate this is related to the Android SDK alone and not the Core database. But the Android SDK doesn't do things that should make it possible to corrupt the database in any way, which may hint at the issue being in the platform or specific implementations of it. We know this doesn't help in any way, but please continue reporting this issue as it can help us to see patterns. When reporting it, please include all details. Thanks for the patience and your help!