JetBrains / lincheck

Framework for testing concurrent data structures
Mozilla Public License 2.0
582 stars 34 forks source link

Lincheck NoClassDefFoundError #376

Open bbrockbernd opened 1 month ago

bbrockbernd commented 1 month ago

Hi! When I used lincheck to test a (faulty) concurrent hash table i got the following error:

Invalid interleaving found, trying to minimize the scenario below:
| -------------------------------------- |
|  Thread 1   |  Thread 2  |  Thread 3   |
| -------------------------------------- |
| put(0, 0)   |            |             |
| -------------------------------------- |
| remove(0)   | put(1, 1)  | get(-1)     |
| put(-1, -1) | put(2, -1) | put(-1, -2) |
| get(2)      | get(-2)    | get(-1)     |
| -------------------------------------- |

OpenJDK 64-Bit Server VM warning: Sharing is only supported for boot loader classes because bootstrap classpath has been appended
java.lang.NoClassDefFoundError: Could not initialize class java.lang.StackTraceElement$HashedModules
Exception in thread "Test worker" java.lang.NoClassDefFoundError: Could not initialize class java.lang.StackTraceElement$HashedModules
Exception: java.lang.NoClassDefFoundError thrown from the UncaughtExceptionHandler in thread "Test worker"
> Task :test FAILED
FAILURE: Build failed with an exception.
* What went wrong:
Execution failed for task ':test'.
> Process 'Gradle Test Executor 21' finished with non-zero exit value 1
  This problem might be caused by incorrect test process configuration.
  For more on test execution, please refer to https://docs.gradle.org/8.8/userguide/java_testing.html#sec:test_execution in the Gradle documentation.
* Try:
> Run with --stacktrace option to get the stack trace.
> Run with --info or --debug option to get more log output.
> Run with --scan to get full insights.
> Get more help at https://help.gradle.org.
BUILD FAILED in 2s

I have the following hash table implementation:

package reproduce

import java.util.concurrent.atomic.*
import kotlin.math.absoluteValue

class ConcurrentHashTable<K : Any, V : Any>(initialCapacity: Int) {
    private val table = AtomicReference(Table<K, V>(initialCapacity))

    fun put(key: K, value: V): V? {
        while (true) {
            val currentTable = table.get()
            val putResult = currentTable.put(key, value)
            if (putResult === NEEDS_REHASH) {
                currentTable.resize()
                table.compareAndSet(currentTable, currentTable.nextTable.get())
            } else {
                return putResult as V?
            }
        }
    }

    fun get(key: K): V? {
        return table.get().get(key)
    }

    fun remove(key: K): V? {
        return table.get().remove(key)
    }

    class Table<K : Any, V : Any>(val capacity: Int) {
        data class Fixed(val value: Any)

        val keys = AtomicReferenceArray<Any?>(capacity)
        val values = AtomicReferenceArray<Any?>(capacity)
        val nextTable = AtomicReference<Table<K, V>?>(null)

        fun put(key: K, value: V): Any? {
            var index = index(key)
            repeat(MAX_PROBES) {
                while (true) {
                    val curKey = keys[index]
                    when (curKey) {
                        null -> {
                            if (!keys.compareAndSet(index, null, key)) continue
                            if (!values.compareAndSet(index, null, value)) continue
                            return null
                        }
                        key -> {
                            while (true) {
                                val currentValue = values[index]
                                if (currentValue == MOVED) return nextTable.get()!!.put(key, value)
                                if (currentValue is Fixed) {
                                    resize()
                                    continue
                                }
                                if (!values.compareAndSet(index, currentValue, key)) continue
                                return currentValue
                            }
                        }
                    }
                    index = (index + 1) % capacity
                    break
                }
            }
            return NEEDS_REHASH
        }

        fun get(key: K): V? {
            var index = index(key)
            repeat(MAX_PROBES) {
                val curKey = keys[index]
                when (curKey) {
                    key -> {
                        val curValue = values[index]
                        if (curValue == MOVED) return nextTable.get()!!.get(key)
                        if (curValue is Fixed) return curValue.value as V?
                        return values[index] as V?
                    }
                    null -> {
                        return null
                    }
                }
                index = (index + 1) % capacity
            }
            return null
        }

        fun remove(key: K): V? {
            var index = index(key)
            repeat(MAX_PROBES) {
                val curKey = keys[index]
                when (curKey) {
                    key -> {
                        while (true) {
                            val curValue = values[index]
                            if (curValue == MOVED) return nextTable.get()!!.remove(key)
                            if (curValue is Fixed) {
                                resize()
                                continue
                            }
                            if (!values.compareAndSet(index, curValue, null)) continue
                            return curValue as V?
                        }
                    }
                    null -> {
                        return null
                    }
                }
                index = (index + 1) % capacity
            }
            return null
        }

        fun resize() {
            nextTable.compareAndSet(null, Table<K, V>(capacity * 2))
            val next = nextTable.get()!!

            for (i in 0 ..< capacity) {
                while (true) {
                    val curKey = keys[i]
                    val curValue = values[i]

                    // if already moved continue
                    if (curKey == MOVED) break

                    // if empty or removed -> MOVED
                    if (curValue == null) {
                        if (!values.compareAndSet(i, null, MOVED)) continue
                        break
                    }

                    // if fixed -> put value and set MOVED
                    if (curValue is Fixed) {
                        // Set value in new table
                        next.put(curKey as K, curValue.value as V)

                        // Set value to moved in this table
                        if (!values.compareAndSet(i, curValue, MOVED)) continue
                        break
                    }

                    // if normal value -> fix
                    values.compareAndSet(i, curValue, Fixed(curValue))
                }
            }
        }
        private fun index(key: Any) = ((key.hashCode() * MAGIC) % capacity).absoluteValue
    }
}

private const val MAGIC = -0x61c88647 // golden ratio 
private const val MAX_PROBES = 2
private val NEEDS_REHASH = Any()
private val MOVED = Any()

And the following test implementation. NOTE: when I don't use inheritance here, the test works as expected and gives me the trace of the concurrent execution. The same holds for the sequential specification.

package reproduce

import org.jetbrains.kotlinx.lincheck.LoggingLevel
import org.jetbrains.kotlinx.lincheck.Options
import org.jetbrains.kotlinx.lincheck.annotations.Operation
import org.jetbrains.kotlinx.lincheck.check
import org.jetbrains.kotlinx.lincheck.strategy.managed.modelchecking.ModelCheckingOptions
import org.junit.FixMethodOrder
import org.junit.Test
import org.junit.runners.MethodSorters

@FixMethodOrder(MethodSorters.NAME_ASCENDING)
abstract class MyLinCheckTestBase {

    @Test
    fun modelCheckingTest() = ModelCheckingOptions()
        .iterations(300)
        .invocationsPerIteration(10_000)
        .actorsBefore(1)
        .threads(3)
        .actorsPerThread(3)
        .actorsAfter(0)
        .checkObstructionFreedom(true)
        .sequentialSpecification(SequentialHashTableIntInt::class.java)
        .logLevel(LoggingLevel.INFO)
        .apply { customConfiguration() }
        .check(this::class.java)

    fun Options<*, *>.customConfiguration() {}
}

class MyLinCheckTestImplementation : MyLinCheckTestBase() {
    private val map = ConcurrentHashTable<Int, Int>(2)
    @Operation
    fun put(k: Int, v: Int): Int? = map.put(k, v)

    @Operation
    fun get(k: Int): Int? = map.get(k)

    @Operation
    fun remove(k: Int): Int? = map.remove(k)
}

class SequentialHashTableIntInt {
    private val map = HashMap<Int, Int>()

    fun put(key: Int, value: Int): Int? = map.put(key, value)

    fun get(key: Int): Int? = map.get(key)

    fun remove(key: Int): Int? = map.remove(key)
}
eupp commented 1 month ago

Hi @bbrockbernd ! Thank you for the bug report!

We sometimes observe similar problems on CI (NoClassDefFoundError). We suspect it can be related to dynamic java agent attachment and class re-transformation process. But because the issue does not reproduce stably on CI, it is very hard to debug it.

Does the issue reproduces stably on your test, or is it a flaky test too?

bbrockbernd commented 1 month ago

Hi @eupp ! AFAIK, for me this specific test and DS consistently end up with this error message.

I have prepared a repo for you that works for me: https://github.com/bbrockbernd/lincheck-noclassdeffound

Some additional info: Project and gradle JVM are 17-corretto Gradle version 8.8 Lincheck version 2.34 Running on Mac M2 Max, MacOS 15.0 (24A335)

Let me know if you need anything else!

bbrockbernd commented 1 month ago

The same bug is now occurring (consistently) on a different project as well that I am testing with lincheck. If you still have troubles reproducing, I could provide some examples from that as well.

eupp commented 2 weeks ago

Hi @bbrockbernd !

I've investigated this issue. I have a quick (and a bit hacky) fix for this issue, which I am going to publish soon.

However, in general, the problem behind this is more serious, I described it here: #419

CC @ndkoval