swaldman / c3p0

a mature, highly concurrent JDBC Connection pooling library, with support for caching and reuse of PreparedStatements.
http://www.mchange.com/projects/c3p0

Another kind of virtual thread pinning on c3p0 getConnection with java 21 (loom) #179

Open alex-kormukhin opened 2 months ago

alex-kormukhin commented 2 months ago

c3p0-loom v. 0.10.1

Thank you for #174, that kind of pinning is gone. But now we have pinning on the synchronized wait/notify in BasicResourcePool.prelimCheckoutResource, in the awaitAvailable method. Unfortunately, that kind of pinning is not reported by java -Djdk.tracePinnedThreads=full.

If we have more than 256 (jdk.virtualThreadScheduler.maxPoolSize) virtual threads executing some JDBC query, then c3p0.maxPoolSize threads are parked on the DB network call and unmounted from their carrier threads (that's good), but the other virtual threads are pinned to carrier threads with this stack and can't be unmounted:

    java.base/java.lang.Object.wait0(Native Method)
    java.base/java.lang.Object.wait(Object.java:366)
    com.mchange.v2.resourcepool.BasicResourcePool.awaitAvailable(BasicResourcePool.java:1440)
    com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:580)
    com.mchange.v2.resourcepool.BasicResourcePool.checkoutResource(BasicResourcePool.java:490)
    com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool.checkoutAndScacheMarkConnectionInUse(C3P0PooledConnectionPool.java:965)
    com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool.checkoutPooledConnection(C3P0PooledConnectionPool.java:893)
    com.mchange.v2.c3p0.impl.AbstractPoolBackedDataSource.getConnection(AbstractPoolBackedDataSource.java:105)

All the carrier threads are blocked running the continuations of such threads:

    #1339 "ForkJoinPool-1-worker-248"
    java.base/jdk.internal.vm.Continuation.run(Continuation.java:251)
    java.base/java.lang.VirtualThread.runContinuation(VirtualThread.java:221)
    java.base/java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1423)
    java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:387)
    java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1312)
    java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1843)
    java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1808)
    java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)

And when the first "good" threads (<= c3p0.maxPoolSize) receive their DB network responses, they can't be scheduled onto carrier threads, so they can't finish their jobs and return connections to the pool.

Some kind of deadlock happens.
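For context, here is a minimal, self-contained sketch (not taken from c3p0 or the demo app) of the pinning mechanism described above, assuming JDK 21. Each virtual thread parks in Object.wait() inside a synchronized block, stays pinned to its carrier, and with the scheduler properties capped at two carriers every other virtual thread is starved until the waiters are released:

    import java.util.concurrent.Executors;

    public class PinningSketch {
        private static final Object LOCK = new Object();

        public static void main(String[] args) throws Exception {
            // Suggested run (the property names are the real JDK 21 ones, the values are just for this demo):
            //   java -Djdk.virtualThreadScheduler.parallelism=2 -Djdk.virtualThreadScheduler.maxPoolSize=2 PinningSketch
            try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
                // These two virtual threads park in Object.wait() while inside a synchronized
                // block; on JDK 21 they stay pinned and occupy both carrier threads.
                for (int i = 0; i < 2; i++) {
                    executor.submit(() -> {
                        synchronized (LOCK) {
                            try {
                                LOCK.wait(10_000); // parked, but the carrier thread is not released
                            } catch (InterruptedException e) {
                                Thread.currentThread().interrupt();
                            }
                        }
                    });
                }
                // This trivial task cannot even be mounted until a carrier is freed.
                executor.submit(() -> System.out.println("finally scheduled"));
                Thread.sleep(2_000);
                synchronized (LOCK) {
                    LOCK.notifyAll(); // release the waiters; only now does the third task run
                }
            } // close() waits for all tasks to finish
        }
    }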

You can test that with the attached demo app: demo-app.zip

Run mvn test.

The first test, "testConnectionPoolWorksInMultiplePlatformThreads()", using platform threads, works OK. The second test, "testConnectionPoolWorksInMultipleVirtualThreads()", using virtual threads, hangs and fails with a timeout:

    com.example.demo.ConnectionPoolTest.testConnectionPoolWorksInMultipleVirtualThreads -- Time elapsed: 20.02 s <<< ERROR!
    java.util.concurrent.TimeoutException: testConnectionPoolWorksInMultipleVirtualThreads() timed out after 20 seconds

While the test hangs, you can capture the stacks with this command:

    jcmd <PID> Thread.dump_to_file -format=text td-01.txt

The stack traces look like this: td-01.txt
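For illustration only, here is a rough sketch of the shape such a virtual-thread test might take; the class name, the dataSource variable, and the query are assumptions, not the actual demo-app code:

    import java.sql.Connection;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;
    import javax.sql.DataSource;

    class VirtualThreadCheckoutSketch {

        // Submits many more tasks than c3p0.maxPoolSize; "dataSource" stands for a
        // pool-backed c3p0 DataSource configured elsewhere.
        static void runQueries(DataSource dataSource, int tasks) throws Exception {
            try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
                List<Future<?>> futures = new ArrayList<>();
                for (int i = 0; i < tasks; i++) {
                    futures.add(executor.submit(() -> {
                        // Threads beyond maxPoolSize block in BasicResourcePool.awaitAvailable()
                        // here and, on JDK 21, pin their carrier threads while they wait.
                        try (Connection c = dataSource.getConnection()) {
                            c.createStatement().execute("SELECT 1");
                        }
                        return null;
                    }));
                }
                for (Future<?> f : futures) {
                    f.get(); // hangs once every carrier thread is pinned
                }
            }
        }
    }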

The current virtual thread implementation is not perfect, unfortunately.

Synchronized wait/notify causes virtual thread pinning. ReentrantLock with a Condition can resolve such issues until the JDK Loom developers fix that imperfection:

    ReentrantLock lock = new ReentrantLock();
    Condition lockCondition = lock.newCondition();
    ...
    lock.lock();
    try {
        lockCondition.await();
    } finally {
        lock.unlock();
    }
    ...
    lock.lock();
    try {
        lockCondition.signalAll();
    } finally {
        lock.unlock();
    }
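To make the suggestion concrete, here is a self-contained sketch of that pattern applied to a tiny pool-like checkout/checkin pair with a timeout, roughly the semantics awaitAvailable needs. It is illustrative only and assumes nothing about c3p0's actual internals; all names are hypothetical:

    import java.util.ArrayDeque;
    import java.util.Deque;
    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.TimeoutException;
    import java.util.concurrent.locks.Condition;
    import java.util.concurrent.locks.ReentrantLock;

    class LockBasedPoolSketch<T> {
        private final ReentrantLock lock = new ReentrantLock();
        private final Condition available = lock.newCondition();
        private final Deque<T> idle = new ArrayDeque<>();

        T checkout(long timeoutMillis) throws InterruptedException, TimeoutException {
            lock.lock();
            try {
                long nanosLeft = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
                while (idle.isEmpty()) {
                    if (nanosLeft <= 0) throw new TimeoutException("no resource available");
                    // Unlike Object.wait() inside synchronized, this lets the virtual thread unmount.
                    nanosLeft = available.awaitNanos(nanosLeft);
                }
                return idle.removeFirst();
            } finally {
                lock.unlock();
            }
        }

        void checkin(T resource) {
            lock.lock();
            try {
                idle.addLast(resource);
                available.signalAll(); // replaces notifyAll()
            } finally {
                lock.unlock();
            }
        }
    }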

swaldman commented 2 months ago

Grrr... I was so happy that the care I took to never hit a blocking operation except wait() while holding a lock meant that c3p0 was mostly loom-ready out of the box. There were just those few issues in the proxies I hadn't thought about. This issue will require switching the main BasicResourcePool and the Statement cache, both of which rely on wait/notifyAll semantics, to ReentrantLock.

Grrr.

(But thank you for the carefully-explained report!)