Closed camsaul closed 1 year ago
The assumption is that you should never reuse a Connection
across multiple threads -- it simply isn't safe because Connection
objects are mutable, and any code that changes, e.g., auto-commit, is going to affect (i.e., interfere with) any other threads using the same Connection
.
Right, we weren't trying to do anything kooky like pass that Connection around and run 10 different queries on 10 different threads with it or anything like that, we just had some code that ran something on a different thread only to enforce a timeout for something. If we could do that on the same thread we would, but I don't know of any way to do that. But the Connection is still being used the same way as if it was all on one Thread
The repro code passes the Connection
to another thread and uses it there -- which is unsafe.
Can you provide repro code that exhibits the deadlock that does not involve accessing the (locked) Connection
on another thread?
No, as far as I know this is only an issue when you use the Connection in another thread.
I agree that sharing Connections across threads generally is generally not a good idea, but that's not exactly the same as an absolute rule. Example from the Oracle that says not to do it but then tells you what to do if you really must do it:
However, Oracle strongly discourages sharing a database connection among multiple threads. Avoid allowing multiple threads to access a connection simultaneously. If multiple threads must share a connection, use a disciplined begin-using/end-using protocol.
I don't think our use case is itself inheritably unsafe. These queries are happening serially, at no point are we ever running multiple queries at the same time. The fact that a separate thread is involved is really just an implementation detail, and the code is still executed in exactly the same order as if it was all done on a single thread. So it doesn't matter if different parts are running around changing .readOnly
or the transaction isolation level, or anything like that. We're comfortable with Connection
being mutable and are not doing anything where that's an issue.
Either way, you're the library author, so if you don't want to support people doing things they generally shouldn't be doing, that's understandable. I just want to point out that there is no way for someone to work around this issue other than very icky hacks, and the error situation here is not obvious at all and extremely hard to work out.
If you're just running queries why use with-transaction
? If you don't use a transaction here, there won't be a lock on the Connection
and then it's "caveat programmer". I'd have to dig quite a long way back in next.jdbc
's history to provide the backstory on why it locks -- but it was doing it back before the nested TX stuff was added...
That was easier than I expected: https://github.com/seancorfield/next-jdbc/issues/106 -- April, 2020.
Well, the query is getting triggered after we insert something something, but the way we have things set up it's all done inside of a transaction. Basically, after inserting Object X, then do Y. It probably doesn't have to be done this way in this case honestly.
Either way, I still think there are some real cases where you'd want to do something like this in real life.
Example: running tests inside of a transaction: open a transaction, insert a lot of rows, do something that fetches those rows and check some condition, then roll back the transaction.
Example: running tests inside of a transaction: open a transaction, insert a lot of rows, do something that fetches those rows and check some condition, then roll back the transaction.
Not on multiple threads. Not going to allow that. Going to close this now.
Ehh, I think there are legitimate reasons another thread might have to be involved in the tests.
Now that I think about it, I do have a real use case outside of tests:
Now it would make sense to do that all inside of a transaction I think. Maybe you switch steps 1 and 2 around, but that wouldn't be as natural as fetching the data when you actually need it. Doesn't seem like it would lead to particularly elegant code if you have to go out of your way to avoid doing any SELECT
s inside of a transaction.
But I don't think the real question is whether mixing DML and DQL inside a single transaction is ever warranted. The question is whether you might ever legitimately need to run something on a separate thread, and I think the answer is a yes.
But I accept your decision, we'll figure out some way to work around it.
Would you be open to a PR to add this to the differences from clojure.java.jdbc
documentation? I'm not sure we would have made the switch before having a workaround in mind if we had known about the implications
I might be open to a PR to not do the locking
in some scenarios based on the binding of *nested-tx*
so there was a way of opting out based on, essentially, opting in to known, unsafe nested TX behavior.
But, basically, this was added specifically -- per #106 -- to avoid a situation that c.j.j happened to allow that led to even harder-to-debug wrong behavior. So, yes, next.jdbc
might prevent some legitimately "OK" stuff but it no longer allows the accidentally wrong-but-very-hard-to-figure-out stuff.
For example, I might consider not locking if *nested-tx*
is bound to :ignore
which was the c.j.j behavior... Would that satisfy you here?
Sure, I'd be happy with if we had any workaround at all. A separate option called :unsafe-you-will-regret-this
works for us too 😄
I wasn't suggesting you remove it entirely anyway. I'd like to keep benefits of locking
there in most cases, just with an escape hatch to disable it in the edge cases where we have a legitimate reason for not wanting it.
Happy to open a PR if you let me know how you'd like it to be done
Root problem is
next.jdbc
is doinghere
https://github.com/seancorfield/next-jdbc/blob/dc240652fc5fda30fe6e289cc2a351ebfc008fd7/src/next/jdbc/transaction.clj#L119
c3p0
com.mchange.v2.c3p0.impl.NewProxyConnection
(the underlying proxy class returned by c3p0) basically tries to lock on every single methode.g.
https://github.com/sluk3r/c3p0/blob/df4ab37ee2fc2894e5d2bf44b3973ab7b87508bf/src/main/java/com/mchange/v2/c3p0/impl/NewProxyConnection.java#L42
and if this results in instant deadlocks if you run something from a different thread.
This definitely happens if you use
with-transaction
in combination with aConnection
, because that's what gets locked; I haven't personally ran into issues usingwith-transaction
with aDataSource
, but I think you would run into the same problem with a DataSource implementation with asynchronized
getConnection
method.I ran into this today when trying to figure out why our application started suddenly taking 6 minutes to launch, after digging most of the day I figured out that we had some code somewhere that was basically doing
and it was getting triggered inside of a transaction which lead to it hanging for 5 seconds, due to some other unrelated bugs it kept trying this over and over and disaster ensued.
To repro
I don't think c3p0 is doing anything that I would consider wrong here. It seems like any JDBC implementation could have synchronized methods in their concrete
java.sql.Connection
classes, which would trigger the same error, even with no connection pooling involved. I think it's risky to lock on anyjava.sql.Connection
. Not sure locking on aDataSource
is much better, I wouldn't be surprised if every single method in c3p0's DataSource implementations aresynchronized
as well, but that wasn't causing us any problems since we have our own DataSource impl that is less fussy.I'm not really 100% sure I understand the purpose of the lock, otherwise I'd offer to submit a PR.
Also I would like to mention that we are considering switching from c3p0 to HikariCP at some point 😄