kazu2012 / persevere-framework

Automatically exported from code.google.com/p/persevere-framework
0 stars 0 forks source link

Database Table Data Sources leak connections to the external database #220

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
I am working on an application with a number of database table data sources 
that connect to an 
external MySQL database.  After a short amount of time using the application, 
Persevere becomes 
unable to make connections to the external database because the database has 
reached its 
connection limit.

From a quick glance at the code, it appears that every request (or many 
requests) to Persevere is 
handled in a new thread, and Persevere opens a new connection to the database 
for every data 
source for every thread (even though all of my data sources are connecting to 
the same database 
with the same connection information).  There does not appear to be any 
connection sharing or 
pooling beyond this thread-level sharing, and the connections never appear to 
be closed.

What steps will reproduce the problem?
1.  Create a configuration with a reasonable number of database table data 
sources (10 or more?
) pointing at an external database.
2. Perform a set of requests from a client to persevere that will in turn 
request data from all of 
these sources  
3. Repeat step 2 until the database runs out of connections

What is the expected output? What do you see instead?
Expected output is that the application continues to work and return all the 
requested data.  
Actual output is that after some time, the database runs out of connections, so 
persevere returns 
errors in its content.

What version of the product are you using? On what operating system?
1.0b6 on Mac OS and Windows

Original issue reported on code.google.com by dsta...@gmail.com on 24 Jul 2009 at 7:17

GoogleCodeExporter commented 8 years ago
Connections can't be shared between threads, can they? I believe we need a 
connection 
for each thread. It is true that it would be better if there was connection 
sharing 
between data sources, although there should be bounded number of data sources. 
Does the 
number of threads times number of data sources exceed your max number of 
connections?

Also, it looks like there was an issue with connections repeatedly being 
created when a 
connection expired for a different table. I fixed that problem, so that should 
make a 
difference.

Original comment by kris...@gmail.com on 25 Jul 2009 at 3:54

GoogleCodeExporter commented 8 years ago
Your changes slightly improved the situation, but they don't resolve it.  My 
database still runs out of 
connections eventually.  The number of threads times the number of datasources 
exceeds the max number of 
connections by a very large margin.  In this example, I'm only really using two 
or three concurrent connections 
right now, but Persevere is creating (or would create) hundreds. Also, the 
database is shared between a 
number of systems, so I can't change its configuration easily.

It is true that connections cannot be used by two threads at the same time, but 
the typical way to resolve this 
is to use a connection pool like Apache's DBCP (based on commons-pool) which 
maintains a set of available 
connections, and ensures that each is only used by one thread at a time.  The 
pool acts as a proxy for your 
data source and returns wrapped connections that act just like ordinary 
connections except that when you 
close them, they go back into the pool instead of being actually closed.

I have reviewed the code with one of my coworkers, and I think we know what it 
would take to fix this, making 
use of Apache's DBCP connection pool.  Would you like us to work on a patch?

Original comment by dsta...@gmail.com on 29 Jul 2009 at 10:59

GoogleCodeExporter commented 8 years ago
That would be fantastic if you could provide a patch. Have you signed a CLA 
yet? (We 
have to a CLA for all contributions to ensure we can maintain a safe liberal 
license). 
Anyway, a patch would certainly be appreciated.

Original comment by kris...@gmail.com on 30 Jul 2009 at 12:09

GoogleCodeExporter commented 8 years ago
Ok, we are working on it.  It looks like my information above was a little out 
of date and there may be some 
better ways to handle connection pooling in recent versions of Java, but the 
general idea is the same.

We have not yet signed a CLA, but I don't think it will be a problem.  Where 
can I find it, and who do I need to 
send it to?

Original comment by dsta...@gmail.com on 3 Aug 2009 at 10:45

GoogleCodeExporter commented 8 years ago
Here is the CLA info:
http://dojofoundation.org/cla/

Original comment by kris...@gmail.com on 3 Aug 2009 at 10:55

GoogleCodeExporter commented 8 years ago
I just sent in a signed CLA for my employer, Quantivo, so we should be good to 
proceed.

Regarding the patch, here is what we did:
First, we set up the Jetty environment to allow JNDI configuration.  This 
requires adding two libraries that are 
part of the Jetty distribution (jetty-plus and jetty-naming), and minor changes 
to the launching scripts to add 
these libraries to the classpath.

Second, we added two new classes to implement the Persevere side of the data 
source 
(PooledDatabaseDataSource and PooledDatabaseTableDataSource).  These are based 
on the existing similarly-
named classes, except they always retrieve their connections from the JNDI data 
source, and they always close 
their connections when they are done with them.

Strictly speaking, these classes are poorly named, since they don't actually do 
connection pooling themselves, 
they just use the connections handed to them by the JNDI data source.  (A 
better name would be 
JndiDatabaseTableDataSource...). However, it is very easy to set up a pooled 
data source through the JNDI 
configuration.

To do this, you just need to add a few things to your application's WEB-INF 
directory-- the JAR file for a 
library that implements connection pooling (such as C3P0 or Apache's DBCP), and 
a jetty-web.xml file that 
sets up the JNDI data source.

To use the data source, supply the JNDI name of your data source when you 
configure your class.

I just want to point out that, because the new code does not attempt to hold on 
to a connection longer than it 
needs it, if you configure a data source that does not implement pooling, this 
implementation will probably 
perform much worse than the original implementation, since it will create and 
tear down a connection for 
every request to the database.  However, if you do use a pooling data source, 
the performance and scalability 
characteristics appear to be much better than before.  Our application is 
running considerably faster now and 
we are not exhausting the connection limits of our MySQL instance.

I am attaching the two new classes, as well as a sample jetty-web.xml file that 
shows how to set up a pooling 
data source using C3P0.

Original comment by dsta...@gmail.com on 25 Aug 2009 at 6:51

Attachments:

GoogleCodeExporter commented 8 years ago
This is great. Do you recommend keeping these separate from DatabaseDataSource 
and 
DatabaseTableDataSource, or can we have the pooled versions replace the 
existing code? 
If we did that, I presume we would need some mechanism for programmatically 
instantiating a pool ourselves, if there isn't one available from JNDI. Would 
that be 
possible? I presume we would also need to include DBCP or C3PO. Is there any 
significant advantages to either one? Apache's DBCP is much smaller (only 
118K), and I 
think I like the Apache license better, so it would work well for including 
with our 
distribution.

Original comment by kris...@gmail.com on 29 Aug 2009 at 3:08

GoogleCodeExporter commented 8 years ago
I do recommend phasing out the old code, because it is pretty much guaranteed 
to break a default installation 
of MySQL if you run it long enough under high load.  We implemented them 
separately so you could decide 
how you wanted to integrate the new code.

These new versions should not break any existing code, but data sources that 
were not using a JNDI 
configured pooling data source would probably suffer a performance penalty for 
creating a new connection 
for every statement.  One way to eliminate this performance penalty would be to 
automatically create a pool 
around manually configured resources.

To do this, you would need to create and manage your own registry of data 
sources (keyed off of the 
connection parameters), and then programmatically create the pool when the data 
source is configured for 
the first time.  This would mean including one of the pool implementations 
(DBCP or C3P0).  Then, for these 
manually configured data sources, you would just look them up in your own 
registry instead of looking them 
up from JNDI.

I don't think this would be too hard to add. We purposely didn't take this 
approach to start because we didn't 
want to force you to adopt any particular library, and because there may be a 
large number of pool 
configuration options that you would then want to be able to pass through to 
this implementation.  However, 
since the core of the new code is now correct for a pooling implementation, 
these changes basically amount 
to some additional alternate setup code and an alternate lookup for getting the 
DataSource that you will use 
to retrieve your connection.

We chose to use C3P0 because it looks like it is more actively being 
maintained.  I'm not sure if there are any 
material differences between them otherwise.

Original comment by dsta...@gmail.com on 31 Aug 2009 at 9:06

GoogleCodeExporter commented 8 years ago
Can you also provide a sample WEB-INF/config/sample.json for this new class?

Original comment by beau.cro...@gmail.com on 1 Sep 2009 at 2:41

GoogleCodeExporter commented 8 years ago
I added the above mentioned Java files to my svn checkout.

Here is my jetty-web.xml:
<Configure class="org.mortbay.jetty.webapp.WebAppContext">
        <New id="jdbc/DATASOURCENAME" class="org.mortbay.jetty.plus.naming.Resource">
        <Arg>java:comp/env/jdbc/DATASOURCENAME</Arg>
        <Arg>
                <New class="com.mchange.v2.c3p0.ComboPooledDataSource">
                                <Set
name="driverClass">com.mysql.jdbc.jdbc2.optional.MysqlConnectionPoolDataSource</
Set>
                        <Set name="jdbcUrl">jdbc:mysql://localhost:3306/itil_models</Set>
                                <Set name="user">root</Set>
                                <Set name="password">root</Set>
                        </New>
        </Arg>
        </New>
</Configure>

and config/p.json:
({
    "id": "p.json",
    "sources": [
    {
        "name": "tenancy_address",
        "sourceClass": "org.persvr.datasource.PooledDatabaseTableDataSource",
        "table": "tenancy_address",
        "idColumn": "id",
        "datasourceRef" : "java:comp/env/jdbc/DATASOURCENAME",
        "camelCaseColumnNames": false,
        "dataColumns": ["street_name", "street_dir", "city", "county", "state",
"country", "postal_code"],
        "schema": {
            "prototype": {},
            "data": {
                "$ref": "../tenancy_address/"
            }
        }
    },
    ]
})

I'll admit, I'm not particularly up to speed on c3p0 or dbcp, so a minor error 
is
entirely possible.

And this is what happens on creation:
java.lang.NullPointerException
        at
org.persvr.datasource.PooledDatabaseDataSource.abortTransaction(PooledDatabaseDa
taSource.java:270)
        at org.persvr.data.Transaction.commit(Transaction.java:289)
        at org.persvr.util.Console.evaluate(Console.java:134)
        at org.persvr.util.Console$1.run(Console.java:113)
        at org.persvr.security.UserSecurity.doPriviledgedAction(UserSecurity.java:50)
        at org.persvr.util.Console.run(Console.java:54)
java.lang.RuntimeException: java.lang.NullPointerException
        at org.persvr.data.Transaction.commit(Transaction.java:295)
        at org.persvr.util.Console.evaluate(Console.java:134)
        at org.persvr.util.Console$1.run(Console.java:113)
        at org.persvr.security.UserSecurity.doPriviledgedAction(UserSecurity.java:50)
        at org.persvr.util.Console.run(Console.java:54)
Caused by: java.lang.NullPointerException
        at
org.persvr.datasource.PooledDatabaseDataSource.startTransaction(PooledDatabaseDa
taSource.java:261)
        at org.persvr.data.Transaction.addAffectedSource(Transaction.java:100)
        at org.persvr.data.Transaction.commit(Transaction.java:227)
        ... 4 more

Original comment by beau.cro...@gmail.com on 1 Sep 2009 at 3:03

GoogleCodeExporter commented 8 years ago
I think this is our mistake.  I'm looking into it.  Sorry.  

Just for clarification, this only happens when you try to write back to the 
database.  Reading works, right?

Original comment by dsta...@gmail.com on 1 Sep 2009 at 7:34

GoogleCodeExporter commented 8 years ago
If I just do a read I get:
url http://localhost:8080/ta/1
({"id":"1",
"error":"java.lang.RuntimeException: javax.naming.NoInitialContextException: 
Need to
specify class name in environment or system property, or as an applet 
parameter, or
in an application resource file:  java.naming.factory.initial"

Original comment by beau.cro...@gmail.com on 1 Sep 2009 at 8:13

GoogleCodeExporter commented 8 years ago
OK, I fixed a couple of problems with the writing code.  I'm really sorry about 
that.  I am attaching new versions 
of the classes.

This doesn't address the issue in comment 12, which I think has to do with 
Jetty configuration.  I'll look into that.

Original comment by dsta...@gmail.com on 1 Sep 2009 at 10:00

Attachments:

GoogleCodeExporter commented 8 years ago
Regarding comment 12, make sure you have added the jetty-plus and jetty-naming 
jars.  I added them to the 
top level lib folder (where the base jetty jar file is).  

Also, you then need to modify the startup scripts to include these new jars in 
the classpath.  When I saw this 
problem in development, it was because I had a mistake in my launch script that 
prevented the classpath from 
being interpreted correctly.  I am attaching modified launch scripts that work 
for me.

Original comment by dsta...@gmail.com on 1 Sep 2009 at 10:10

Attachments:

GoogleCodeExporter commented 8 years ago
Slightly further now:
Retrieval (a = load("a/1")):
a.toSource()
({error:"java.lang.RuntimeException: javax.naming.NameNotFoundException; 
remaining
name 'env/jdbc/DATASOURCENAME'"})

Creation:
address()
java.sql.SQLException: transactionConnection is not valid in call to 
abortTransaction!
        at
org.persvr.datasource.PooledDatabaseDataSource.abortTransaction(PooledDatabaseDa
taSource.java:272)
        at org.persvr.data.Transaction.commit(Transaction.java:289)
        at org.persvr.util.Console.evaluate(Console.java:134)
        at org.persvr.util.Console$1.run(Console.java:113)
        at org.persvr.security.UserSecurity.doPriviledgedAction(UserSecurity.java:50)
        at org.persvr.util.Console.run(Console.java:54)
java.lang.RuntimeException: java.lang.RuntimeException:
javax.naming.NameNotFoundException; remaining name 'env/jdbc/DATASOURCENAME'
        at org.persvr.data.Transaction.commit(Transaction.java:295)
        at org.persvr.util.Console.evaluate(Console.java:134)
        at org.persvr.util.Console$1.run(Console.java:113)
        at org.persvr.security.UserSecurity.doPriviledgedAction(UserSecurity.java:50)
        at org.persvr.util.Console.run(Console.java:54)
Caused by: java.lang.RuntimeException: javax.naming.NameNotFoundException; 
remaining
name 'env/jdbc/DATASOURCENAME'
        at
org.persvr.datasource.PooledDatabaseDataSource.createConnection(PooledDatabaseDa
taSource.java:131)
        at
org.persvr.datasource.PooledDatabaseDataSource.startTransaction(PooledDatabaseDa
taSource.java:264)
        at org.persvr.data.Transaction.addAffectedSource(Transaction.java:100)
        at org.persvr.data.Transaction.commit(Transaction.java:227)
        ... 4 more
Caused by: javax.naming.NameNotFoundException; remaining name 
'env/jdbc/DATASOURCENAME'
        at org.mortbay.naming.NamingContext.lookup(NamingContext.java:634)
        at org.mortbay.naming.NamingContext.lookup(NamingContext.java:665)
        at org.mortbay.naming.NamingContext.lookup(NamingContext.java:680)
        at org.mortbay.naming.java.javaRootURLContext.lookup(javaRootURLContext.java:112)
        at javax.naming.InitialContext.lookup(InitialContext.java:392)
        at
org.persvr.datasource.PooledDatabaseDataSource.lookupDatasource(PooledDatabaseDa
taSource.java:111)
        at
org.persvr.datasource.PooledDatabaseDataSource.createConnection(PooledDatabaseDa
taSource.java:125)
        ... 7 more

Original comment by beau.cro...@gmail.com on 2 Sep 2009 at 12:29

GoogleCodeExporter commented 8 years ago
I'm not an expert on configuring Jetty, but it looks like it isn't finding the 
JNDI reference you create in jetty-
web.xml.  I'm not sure whether it is even processing your jetty-web.xml file.

A couple of things to try: make sure jetty-web.xml is a direct child of the 
WEB-INF directory in your 
application, make sure jetty-web.xml is structured correctly (the fragment you 
pasted above looks correct, 
but there should also be some xml header info), and that it is actually named 
jetty-web.xml.  You could also 
try modifying the name you specify in config/p.json to an intentionally 
incorrect value to see if that changes 
the error you get, or try deleting the contents of jetty-web.xml to see if that 
changes the error you get.

Original comment by dsta...@gmail.com on 2 Sep 2009 at 6:05

GoogleCodeExporter commented 8 years ago
Thanks again for this excellent contribution.

Original comment by kris...@gmail.com on 4 Sep 2009 at 7:18