Open paulaustin-automutatio opened 6 days ago
Fair enough, but I'm curious why this would be blocked?
Dave,
Unfortunately I couldn't find an answer to that question. There didn't seem to be anything else that was using the postgresql classes at the time. I have code which can dump the stack traces for all threads, including virtual threads that I create (normally the JVM doesn't list virtual threads via the MX Bean).
I looked through that class to see where you were using the lock and as far as I can tell it was doing so using a try with resources block so it should be releasing the locks correctly.
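For readers unfamiliar with the pattern mentioned here, this is a minimal sketch of a lock used in a try-with-resources block. `AutoLock` and `GuardedFlag` are hypothetical names made up for illustration, not the driver's actual classes; the point is only that `close()` releases the lock on every exit path, including exceptions.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical helper (not the driver's class): a ReentrantLock that can be
// used as a try-with-resources resource.
class AutoLock extends ReentrantLock implements AutoCloseable {
    // Acquire the lock and return this so it can be declared as the resource.
    AutoLock obtain() {
        lock();
        return this;
    }

    // Invoked automatically when the try block exits, normally or by exception.
    @Override
    public void close() {
        unlock();
    }
}

// Hypothetical class with one field guarded by the lock.
class GuardedFlag {
    private final AutoLock lock = new AutoLock();
    private boolean flag;

    void setFlag(boolean value) {
        try (AutoLock ignored = lock.obtain()) {
            flag = value;
        }
    }

    boolean getFlag() {
        try (AutoLock ignored = lock.obtain()) {
            return flag;
        }
    }

    // Lets a caller check that the lock was released after each operation.
    boolean isLockedByCurrentThread() {
        return lock.isHeldByCurrentThread();
    }
}
```

With this shape, a leaked lock would have to come from somewhere other than a missed `unlock()` in these methods.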
Paul
Seems bizarre; we didn't think the lock would really affect anything like that. I'm OK removing them, but something's amiss here.
Is a single connection used by multiple threads?
At any given time the connection should only be used by a single thread. However, that connection is in a connection pool, and over time it might be used by different threads. It shouldn't be used by two threads concurrently.
It might be that the overhead of ReentrantLock is much higher with the virtual threads. It looks like we benchmarked regular threads only.
See https://github.com/pgjdbc/pgjdbc/issues/1951#issuecomment-1253521413
@rbygrave, did you benchmark virtual threads + ReentrantLock?
By the way, there's https://openjdk.org/jeps/491: with it, synchronized would no longer pin the thread, so we might want to revert to simple synchronization.
It might be that the overhead of ReentrantLock is much higher with the virtual threads. It looks like we benchmarked regular threads only.
That wouldn't cause it to block though. I presume you are just suggesting we may want to just use synchronized instead.
I would stick with the ReentrantLock, but check whether a lock is required at all. In this case it shouldn't be needed for getting the value. I'd also confirm why the lock would be needed for setting the values. If it is so the values don't change while you are performing a task, I would read those values into variables at the beginning of the task and use those.
I did notice some code is repeatedly calling getStandardConformingStrings in the same method. So you have the overhead of that lock multiple times. For example in org.postgresql.core.CachedQueryCreateAction.create(Object). Put all of those into variables at the start of a method to avoid that extra overhead.
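A minimal sketch of the suggestion above, assuming a getter that takes a lock on every call. `SettingsSource` and `QueryBuilderSketch` are hypothetical stand-ins, not the driver's code; the counter exists only to make the number of lock acquisitions visible in this single-threaded demo.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for an executor whose getter locks on every call.
class SettingsSource {
    private final ReentrantLock lock = new ReentrantLock();
    private boolean standardConformingStrings = true;
    private int lockAcquisitions; // incremented under the lock, read in the demo only

    boolean getStandardConformingStrings() {
        lock.lock();
        try {
            lockAcquisitions++;
            return standardConformingStrings;
        } finally {
            lock.unlock();
        }
    }

    int lockAcquisitions() {
        return lockAcquisitions;
    }
}

class QueryBuilderSketch {
    // Calls the getter each time the value is needed: three lock acquisitions.
    static String repeated(SettingsSource s, String sql) {
        String mode = s.getStandardConformingStrings() ? "standard" : "legacy";
        String escape = s.getStandardConformingStrings() ? "" : "E";
        boolean scs = s.getStandardConformingStrings();
        return mode + ":" + escape + ":" + scs + ":" + sql;
    }

    // Reads the value once into a local and reuses it: one lock acquisition,
    // and the method sees a single consistent value throughout.
    static String snapshot(SettingsSource s, String sql) {
        final boolean scs = s.getStandardConformingStrings();
        String mode = scs ? "standard" : "legacy";
        String escape = scs ? "" : "E";
        return mode + ":" + escape + ":" + scs + ":" + sql;
    }
}
```

Both methods build the same string; the second also cannot observe the setting changing halfway through the method.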
I can put this in a separate ticket. But you might want to look at java.util.concurrent.ConcurrentHashMap<K, V> and the computeIfAbsent method for LruCache. I've had good success using that instead of locking for managing a cache of values.
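For reference, a minimal sketch of the `computeIfAbsent` approach mentioned here. `ComputeCache` is a hypothetical class, not a replacement for the driver's LruCache: it has no size limit and no LRU eviction, so it only illustrates the lock-free lookup-or-create step.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical unbounded cache; it does NOT evict entries, unlike an LRU cache.
class ComputeCache<K, V> {
    private final ConcurrentHashMap<K, V> map = new ConcurrentHashMap<>();
    private final Function<K, V> loader;

    ComputeCache(Function<K, V> loader) {
        this.loader = loader;
    }

    // computeIfAbsent runs the loader at most once per absent key and returns
    // the cached value on later calls, without an explicit lock in this class.
    V get(K key) {
        return map.computeIfAbsent(key, loader);
    }

    int size() {
        return map.size();
    }
}
```

Whether this is worth doing depends on the cache actually being shared between threads.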
JEP 491 won't be around until JDK 24, so I'd avoid synchronized for now.
I presume you are just suggesting we may want to just use synchronized instead
Based on #1951 benchmarks for the regular threads we concluded that the difference between synchronized and ReentrantLock was negligible, so we went ahead and replaced all the synchronized with ReentrantLock to be sure there's no virtual thread pinning.
If it turns out the overhead is significant, then we could revisit this. For the simple cases like get/set, we could use synchronized even now. The problematic case for the virtual threads is when we perform IO while holding a synchronized monitor. As long as we are not doing IO, we could use synchronized just fine.
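A minimal sketch of the split described above, assuming a hypothetical class with one simple setting and one blocking operation. `MixedLockingSketch`, `timeoutMillis`, and `readWithLock` are invented for illustration; the supplier stands in for whatever blocking IO the real code would do.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

// Hypothetical class: synchronized for plain field access, ReentrantLock
// around the part that may block on IO.
class MixedLockingSketch {
    private final ReentrantLock ioLock = new ReentrantLock();
    private int timeoutMillis;

    // No IO happens while the monitor is held, so the monitor is only held
    // for the duration of a field read or write.
    synchronized int getTimeoutMillis() {
        return timeoutMillis;
    }

    synchronized void setTimeoutMillis(int value) {
        timeoutMillis = value;
    }

    // The potentially blocking call runs under a ReentrantLock rather than a
    // monitor, which avoids pinning a virtual thread on JDKs before JEP 491.
    String readWithLock(Supplier<String> blockingRead) {
        ioLock.lock();
        try {
            return blockingRead.get();
        } finally {
            ioLock.unlock();
        }
    }
}
```

This only illustrates the rule of thumb; it says nothing about which variant is faster, which is what the benchmarks would need to show.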
However, before we make any replacements, I would like to get some benchmarks to quantify the change.
But you might want to look at java.util.concurrent.ConcurrentHashMap<K, V> and the computeIfAbsent method for LruCache
Currently, LruCache is not shared across threads, and different threads should avoid the concurrent use of the same Connection. So I do not see a reason to use ConcurrentHashMap there. There should be absolutely no contention on LruCache right now.
But you might want to look at java.util.concurrent.ConcurrentHashMap<K, V> and the computeIfAbsent method for LruCache
Currently, LruCache is not shared across threads, and different threads should avoid the concurrent use of the same Connection. So I do not see a reason to use ConcurrentHashMap there. There should be absolutely no contention on LruCache right now.
The only reason I brought it up was that the LruCache is using a ReentrantLock.
I'm submitting a ...
Describe the issue
I have a process that is hung in the QueryExecutorBase for 12+ hours.
In the org.postgresql.core.QueryExecutorBase it is using reentrant locks to protect fields and operations. However it is also locking simple get operations that are atomic by definition. Is there some reason that this is needed? If a caller (e.g. borrowObject) needs to make sure it is using the same values for those fields then it should do the lock, not the method to get the value.
For example
Should be
This applies to all of the get methods that just return values.
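A minimal sketch of the before/after shape being proposed, using hypothetical classes rather than the actual QueryExecutorBase source. The `volatile` modifier in the second class is an assumption added here so that a value written by one thread is visible to readers without a lock.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical "before": the getter takes the lock just to return a field.
class LockedSetting {
    private final ReentrantLock lock = new ReentrantLock();
    private boolean standardConformingStrings;

    boolean getStandardConformingStrings() {
        lock.lock();
        try {
            return standardConformingStrings;
        } finally {
            lock.unlock();
        }
    }

    void setStandardConformingStrings(boolean value) {
        lock.lock();
        try {
            standardConformingStrings = value;
        } finally {
            lock.unlock();
        }
    }
}

// Hypothetical "after": a single boolean read is atomic, so the getter returns
// the field directly; volatile (an assumption of this sketch) gives visibility.
class UnlockedSetting {
    private volatile boolean standardConformingStrings;

    boolean getStandardConformingStrings() {
        return standardConformingStrings;
    }

    void setStandardConformingStrings(boolean value) {
        standardConformingStrings = value;
    }
}
```

A caller that needs several fields to stay consistent with each other would still have to hold a lock around the whole sequence of reads.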
Driver Version?
4.7.2
Java Version?
21.0.4
OS Version?
N/A
PostgreSQL Version?
16
To Reproduce Steps to reproduce the behaviour:
Unable to directly reproduce as it is a locking race condition.
Expected behaviour
Code to not block
Logs
Here is a stack trace that is blocked. There are no other threads currently using any postgresql classes.