swaldman / c3p0

a mature, highly concurrent JDBC Connection pooling library, with support for caching and reuse of PreparedStatements.
http://www.mchange.com/projects/c3p0
Other
1.29k stars 339 forks source link

Force kill acquires #91

Closed rscadrde closed 5 years ago

rscadrde commented 7 years ago

90

swaldman commented 7 years ago

hi,

thanks.

did you observe the potential issue here in practice, or review the code and notice the possibility in theory that the reversion of force_kill_acquires would be skipped?

thanks again.

rscadrde commented 7 years ago

(sorry for the autocrlf with windows eclipse did not work).

With mule runtime we hat that issue in production that client (mule with c3p0) did run but Database was down for maintenance. During that time somehow all force_kill_acquires went to "true" and periodically

com.mchange.v2.resourcepool.ResourcePoolException: A ResourcePool cannot acquire a new resource -- the factory or source appears to be down.
    at com.mchange.v2.resourcepool.BasicResourcePool.awaitAvailable(BasicResourcePool.java:1429)
    at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:644)
    at com.mchange.v2.resourcepool.BasicResourcePool.checkoutResource(BasicResourcePool.java:554)
    at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool.checkoutAndMarkConnectionInUse(C3P0PooledConnectionPool.java:758)
    at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool.checkoutPooledConnection(C3P0PooledConnectionPool.java:685)
    at com.mchange.v2.c3p0.impl.AbstractPoolBackedDataSource.getConnection(AbstractPoolBackedDataSource.java:140)
    at org.mule.module.db.internal.domain.connection.SimpleConnectionFactory.doCreateConnection(SimpleConnectionFactory.java:26)

appeared.

Solution was to restart the client.

(Mule Case Number 00164722 (not accessible for public))

First exceptions you can find in FirstExceptionWas.txt

rscadrde commented 7 years ago

Hi swaldman, case happened again.

Do you need some more informations to fix the try/finally issue ?

swaldman commented 7 years ago

hi @rscadrde,

i'm very sorry to have let this go by. i've been sick the last few weeks, i'll pretend that's an excuse.

anyway, a few things. broadly, i want to accept this. but...

annoyances:

  1. the cr-lf issue makes this too ugly to just review and merge. i'm perhaps overparanoid, but i don't want to merge something this sprawling. i've just deleted trailing whitespace from BasicResourcePool.java and pushed that, which creates a clean base to work from.

  2. before i can merge a substantive contribution, i need you to consent to an informal CLA. basically, I have to ask your permission too 1) include and redistribute this under the existing choice of licenses; and 2) relicense the contribution on alternative terms as I see fit.

i don't love asking for (2), but not having that permission was a hassle when the library migrated from LGPL alone to choice of LGPL or EPL. In general, any relicensing would probably be to some alternative open source license, but i also use these libraries in consulting work and it's not inconceivable, although it has never actually happened, that some client would ask for a nonexclusive license on custom terms.

substantively:

  1. you move force_kill_acquires = true; into try-with-resources assignment position. i'd probably prefer not to do that, as the assignment cannot be automatically reverted. (indeed, the main change is to place that explicit reversion into the finally clause.)

  2. i think we should add some logging of the hopefully very unusual event that might precipitate an attempt to forceKillAcquires() to fail. Presumably, I suspect the cause is some other entity calling interrupt() on c3p0's helper threads. in that case, we should probably add a catch clause that captures the InterruptedException and forwards the interrupt to remaining acquireWaiters on a best-efforts basis before rethrowing the InterruptedException. This will amount to a best-effort attempt to leave the world in the intended state (no more waiters) without preventing the third-party interrupt from having its effect. In all cases, we should emit a warning that includes information about the Exception that caused us to abort and whether waiting clients appear to remain.

that sounds like a lot, but in code terms it isn't so much. if you'd like to do it, i'll wait for a revised PR. if you'd prefer that I make these changes, let me know.

thank you again, and i am very sorry for (all too typically) flaking.

rscadrde commented 7 years ago

Place reversion into the finally would be sufficient for me.

Catching / logging / rethrowing InterruptedException sounds good as add-on.

"consent to an informal CLA" => I do.

According to cr-lf issue it would be better to reject my changes and re-push it from your branch yourself.

(In case you prefer me to do the change I would again try to make a clone/commit with the correct "cr".)

swaldman commented 5 years ago

(resolved i think in 0.9.5.4.)