objectbox / objectbox-java

Android Database - first and fast, lightweight on-device vector database
https://objectbox.io
Apache License 2.0
4.41k stars 302 forks source link

JNI ERROR (app bug): global reference table overflow on executing many instrumentation tests in a row #1080

Closed Faltenreich closed 2 years ago

Faltenreich commented 2 years ago

Describe the bug We have an app with around 50-60 instrumentation tests. Most of them use an underlying and temporary ObjectBox database. After a certain amount of tests in a row, we run reproducibly into a JNI Error. This error halts our test suite and therefore our whole CI pipeline. The error occurs on different tests but always after a certain number of instrumentation tests.

We assume that this JNI Error is caused by ObjectBox as we do not handle any native code and most of the classes from the log have something to do with this otherwise awesome library, e.g. entities or property converters. It seems that ObjectBox does hold some global references that are not being freed between tests.

We tried to setup/teardown the in-memory database before for every test and the whole test suite (via the CustomTestRunner down below), but this did not solve our problem.

Basic info:

To Reproduce Steps to reproduce the behavior:

  1. Have an Android project with around 50-60 instrumentation tests that use ObjectBox under the hood
  2. Run every test, one after another
  3. A certain amount of tests runs succesfully
  4. After a certain amount one test fails with the error below
  5. Test suite stops and remaining tests will not be executed

Expected behavior Less unique instances of ObjectBox-related classes and therefore no global reference table overflow anymore when running many instrumentation tests.

Code

// build.gradle

plugins {
    id('com.android.application')
    id('kotlin-android')
    id('kotlin-kapt')
    id("dagger.hilt.android.plugin")
    id('de.mannodermaus.android-junit5')
}

android {
    compileSdk 32

    defaultConfig {
        applicationId "de.thermomess.field"
        minSdk 21
        targetSdk 32
        versionCode 137
        versionName "5.0.0"
        testInstrumentationRunner "de.custom.CustomTestRunner"
    }

    testOptions {
        animationsDisabled = true
        unitTests {
            includeAndroidResources = true
            all {
                testLogging {
                    outputs.upToDateWhen { false }
                    events "passed", "failed", "standardError"
                    showCauses true
                    showExceptions true
                }
            }
        }
        packagingOptions {
            jniLibs {
                useLegacyPackaging true
            }
        }
    }

    [...]
}
// Test Runner which setups up and tears down in-memory database
class CustomTestRunner : AndroidJUnitRunner() {

    // Would normally be injected via dependency injection but left here to simplify code example
    var database: BoxStore? = null

    override fun onCreate(arguments: Bundle?) {
        super.onCreate(arguments)
        val directory = checkNotNull(InstrumentationRegistry.getInstrumentation().targetContext.getExternalFilesDir("objectbox-test/test-db"))
        database = MyObjectBox.builder().androidContext(InstrumentationRegistry.getInstrumentation().targetContext)
            .directory(directory)
            .maxSizeInKByte(5000)
            .maxReaders(1)
            .initialDbFile {
                BufferedInputStream(InstrumentationRegistry.getInstrumentation().targetContext.resources.openRawResource(R.raw.data))
            }
            .debugFlags(DebugFlags.LOG_QUERIES or DebugFlags.LOG_QUERY_PARAMETERS)
            .build()
    }

    override fun onDestroy() {
        super.onDestroy()
        database?.close()
        database?.deleteAllFiles()
        database = null
    }
}

Logs, stack traces

JNI ERROR (app bug): global reference table overflow (max=51200)global reference table dump: 
  Last 10 entries (of 51200): 
    51199: 0x13193a88 java.lang.Class<j$.time.LocalDateTime> 
    51198: 0x131939c8 java.lang.Class<de.custom.LocalDateTimePropertyConverter> 
    51197: 0x1336f9f8 de.custom.LocalDateTimePropertyConverter 
    51196: 0x13193a88 java.lang.Class<j$.time.LocalDateTime> 
    51195: 0x131939c8 java.lang.Class<de.custom.LocalDateTimePropertyConverter> 
    51194: 0x13197e90 java.lang.Class<de.custom.SomeObjectBoxEntity> 
    51193: 0x1336f990 de.custom.SomeEnumPropertyConverter 
    51192: 0x13197d80 java.lang.Class<de.custom.SomeEnum> 
    51191: 0x131941c8 java.lang.Class<de.custom.SomeEnumPropertyConverter> 
    51190: 0x1336f988 de.custom.SomeEnumPropertyConverter
  Summary: 
    34560 of java.lang.Class (450 unique instances) 
    7680 of de.custom.LocalDateTimePropertyConverter (7680 unique instances) 
    7478 of de.custom.SomeEnumPropertyConverter (7478 unique instances) 
    288 of java.nio.DirectByteBuffer (288 unique instances) 272 of io.objectbox.BoxStore (52 unique instances) 
    208 of de.custom.SomeOtherEnumPropertyConverter (208 unique instances) 
    208 of de.custom.SomeOtherOtherEnumPropertyConverter (208 unique instances) 
    105 of de.custom.SomeOtherOtherOtherEnumPropertyConverter (105 unique instances) 
    104 of de.custom.SomeOtherOtherOtherOtherEnumPropertyConverter (104 unique instances) 
    52 of de.custom.SomeOtherOtherOtherOtherOtherEnumPropertyConverter (52 unique instances) 
    52 of de.custom.SomeOtherOtherOtherOtherOtherOtherEnumPropertyConverter (52 unique instances) 
    52 of de.custom.SomeOtherOtherOtherOtherOtherOtherOtherEnumPropertyConverter (52 unique instances) 
    20 of android.app.assist.AssistStructure$SendChannel (20 unique instances) 
    17 of android.view.autofill.AutofillManager$AutofillManagerClient (17 unique instances) 
    15 of android.view.ViewRootImpl$W (15 unique instances) 15 of android.view.accessibility.AccessibilityManager$1 (15 unique instances)

Additional context Our current workaround is to run our instrumentation tests in chunks, so that the JNI Error does not appear after a certain amount of tests.

Thanks in advance and best regards!

greenrobot-team commented 2 years ago

Thanks for this detailed report! I quickly noticed that many classes called converter are in the reference table dump. Do your @Entity classes use @Convert? Can you share an example entity and converter implementation?

Faltenreich commented 2 years ago

Thank you for your quick response! Yes, our classes use PropertyConverters, e.g. one for converting java.time.LocalDateTime into Long. See our sample implementation:

@Entity
data class SampleEntity(
    @Index
    var id: Long? = null,
    @Convert(converter = LocalDateTimePropertyConverter::class, dbType = Long::class)
    var localDateTime: LocalDateTime? = null,
    @Convert(converter = SomeEnumPropertyConverter::class, dbType = String::class)
    var someEnum: SomeEnum? = null,
) {
    // Fixes relation for unit tests: https://docs.objectbox.io/relations#initialization-magic
    @JvmField
    @Transient
    @kotlin.jvm.Transient
    @Suppress("PropertyName")
    var __boxStore: BoxStore? = null
}
class LocalDateTimePropertyConverter : PropertyConverter<LocalDateTime, Long> {

    override fun convertToEntityProperty(databaseValue: Long?): LocalDateTime? {
        databaseValue ?: return null 
        return LocalDateTime.ofEpochSecond(databaseValue, 0, ZoneOffset.UTC)
    }

    override fun convertToDatabaseValue(entityValue: LocalDateTime?): Long? {
        return entityValue?.toEpochSecond(ZoneOffset.UTC)
    }
}
Faltenreich commented 2 years ago

@greenrobot-team Is the info I provided enough to replicate this problem or do you need more to fully investigate?

greenrobot-team commented 2 years ago

@Faltenreich Yes, thank you. I was on holiday and currently some other tasks have priority, but will look into this!

greenrobot-team commented 2 years ago

Just had a quick look: from my initial testing it appears that onDestroy of the CustomTestRunner is never called. According to documentation it's only "Called when the instrumented application is stopping, after all of the normal application cleanup has occurred.".

The crash log also indicates there are 52 unique instances of BoxStore. If you really run each of your 50-60 instrumented tests separately as you say, it's possible a new store is created for each and never cleaned up in between?

Edit: not sure this is right. The test app is killed after each test. How do you run your tests? Using Gradle tasks or adb commands?

Maybe try to create/destroy BoxStore using @Before/@After or @BeforeClass/@AfterClass functions instead of in a custom test runner?

Faltenreich commented 2 years ago

Sadly we have the same outcome if we create and destroy the BoxStore before and after every test via an IntentsTestRule:

class CustomTestRule : IntentsTestRule<T>(MainActivity::class.java) {

    // Would normally be injected via dependency injection but left here to simplify code example
    var database: BoxStore? = null

    private lateinit var databaseDirectory: File

    override fun beforeActivityLaunched() {
        databaseDirectory = checkNotNull(InstrumentationRegistry.getInstrumentation().targetContext.getExternalFilesDir("objectbox-test/test-db"))
        BoxStore.deleteAllFiles(databaseDirectory)
        database = MyObjectBox.builder().androidContext(InstrumentationRegistry.getInstrumentation().targetContext)
            .directory(databaseDirectory)
            .maxSizeInKByte(5000)
            .maxReaders(1)
            .initialDbFile {
                BufferedInputStream(InstrumentationRegistry.getInstrumentation().targetContext.resources.openRawResource(R.raw.data))
            }
            .debugFlags(DebugFlags.LOG_QUERIES or DebugFlags.LOG_QUERY_PARAMETERS)
            .build()
    }

    override fun afterActivityFinished() {
        database.close()
        BoxStore.deleteAllFiles(databaseDirectory)
    }
}

We run our tests via Gradle tasks, e.g. Task :app:connectedBetaDebugAndroidTest.

greenrobot-team commented 2 years ago

Thanks! We had a closer look as well and could verify the memory leak. The converter classes are indeed not freed up after closing the store.

We'll update this issue when a fix is ready to be shipped.

greenrobot-team commented 2 years ago

Release 3.2.1 which fixes this issue is now available.

Faltenreich commented 2 years ago

Version 3.2.1 fixes our test suite and all instrumentation tests are subsequently succeeding again!

Thank you very much for your effort and keep up the good work! <3