sagarswathi / h2database

Automatically exported from code.google.com/p/h2database
0 stars 1 forks source link

Slow because of using "OR" conditions #345

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
I'm using the built-in full text search on two tables that have a parent-child 
relationship and I encountered what appears to be deadlock in a single thread.

Here's a simplification of what I'm doing, except my tables have a lot more 
columns and my tables each have about 100,000 rows in them. This example does 
not reproduce the deadlock.

1. CREATE TABLE EMPLOYEES(ID INT PRIMARY KEY, NAME VARCHAR(255), ADDRESS_ID INT)
2. CREATE TABLE ADDRESSES(ID INT PRIMARY KEY,  ADDRESS VARCHAR(255))
3. insert into addresses values (1, 'Washington, DC')
4. insert into employees  values (1, 'John Doe', 1)
5. insert into addresses values (1, 'Fairfax, VA')
6. insert into employees  values (2, 'George Washington', 1)
7. call FT_CREATE_INDEX ('PUBLIC', 'EMPLOYEES', null)
8. call FT_CREATE_INDEX ('PUBLIC', 'ADDRESSES', null)
9. select E.* from EMPLOYEES E where E.address_id in (select A.ID from 
FT_SEARCH_DATA('Fairfax', 0, 0) S, ADDRESSES A where S.TABLE='ADDRESSES' and 
A.ID=S.KEYS[0]) or (select T.ID from FT_SEARCH_DATA('Fairfax', 0, 0) S, 
EMPLOYEES T where S.TABLE='EMPLOYEES ' and T.ID=S.KEYS[0])

What is the expected output? What do you see instead?
"pool-2-thread-8" prio=10 tid=0x8e10b000 nid=0x55a2 runnable [0x8e3ac000]
   java.lang.Thread.State: RUNNABLE
    at org.h2.table.RegularTable.lock(RegularTable.java:438)
    - locked <0x955b4b38> (a org.h2.engine.Database)
    at org.h2.table.TableFilter.lock(TableFilter.java:142)
    at org.h2.command.dml.Select.queryWithoutCache(Select.java:590)
    at org.h2.command.dml.Query.query(Query.java:298)
    at org.h2.command.dml.Query.query(Query.java:268)
    at org.h2.command.dml.Query.query(Query.java:37)
    at org.h2.command.CommandContainer.query(CommandContainer.java:80)
    at org.h2.command.Command.executeQuery(Command.java:181)
    - locked <0x955b4b38> (a org.h2.engine.Database)
    at org.h2.jdbc.JdbcPreparedStatement.executeQuery(JdbcPreparedStatement.java:96)
    - locked <0x959b9a48> (a org.h2.engine.Session)
    at org.h2.fulltext.FullText.search(FullText.java:608)
    at org.h2.fulltext.FullText.searchData(FullText.java:304)
    at sun.reflect.GeneratedMethodAccessor28.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.h2.engine.FunctionAlias$JavaMethod.getValue(FunctionAlias.java:403)
    at org.h2.expression.JavaFunction.getValue(JavaFunction.java:38)
    at org.h2.table.FunctionTable.getValueResultSet(FunctionTable.java:191)
    at org.h2.table.FunctionTable.getResult(FunctionTable.java:162)
    at org.h2.index.FunctionIndex.find(FunctionIndex.java:45)
    at org.h2.index.BaseIndex.find(BaseIndex.java:102)
    at org.h2.index.IndexCursor.find(IndexCursor.java:145)
    at org.h2.table.TableFilter.next(TableFilter.java:321)
    at org.h2.command.dml.Select.queryFlat(Select.java:512)
    at org.h2.command.dml.Select.queryWithoutCache(Select.java:617)
    at org.h2.command.dml.Query.query(Query.java:298)
    at org.h2.command.dml.Query.query(Query.java:268)
    at org.h2.expression.ConditionInSelect.getValue(ConditionInSelect.java:45)
    at org.h2.expression.ConditionAndOr.getValue(ConditionAndOr.java:83)
    at org.h2.expression.Expression.getBooleanValue(Expression.java:180)
    at org.h2.command.dml.Select.queryFlat(Select.java:514)
    at org.h2.command.dml.Select.queryWithoutCache(Select.java:617)
    at org.h2.command.dml.Query.query(Query.java:298)
    at org.h2.command.dml.Query.query(Query.java:268)
    at org.h2.command.dml.Query.query(Query.java:37)
    at org.h2.command.CommandContainer.query(CommandContainer.java:80)
    at org.h2.command.Command.executeQuery(Command.java:181)
    - locked <0x955b4b38> (a org.h2.engine.Database)
    at org.h2.jdbc.JdbcStatement.executeInternal(JdbcStatement.java:173)
    - locked <0x959b9a48> (a org.h2.engine.Session)
    at org.h2.jdbc.JdbcStatement.execute(JdbcStatement.java:152)
    at org.h2.server.web.WebApp.getResult(WebApp.java:1311)
    at org.h2.server.web.WebApp.query(WebApp.java:1001)
    at org.h2.server.web.WebApp.query(WebApp.java:977)
    at org.h2.server.web.WebApp.process(WebApp.java:217)
    at org.h2.server.web.WebApp.processRequest(WebApp.java:164)
    at org.h2.server.web.WebServlet.doGet(WebServlet.java:118)
    at org.h2.server.web.WebServlet.doPost(WebServlet.java:153)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:533)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1351)
    at com.argonst.nps.authorization.AuthorizationFilter.doFilter(AuthorizationFilter.java:84)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1322)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:473)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:119)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:514)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:920)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:403)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:184)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:856)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:247)
    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:151)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:114)
    at org.eclipse.jetty.server.Server.handle(Server.java:352)
    at org.eclipse.jetty.server.HttpConnection.handleRequest(HttpConnection.java:596)
    at org.eclipse.jetty.server.HttpConnection$RequestHandler.content(HttpConnection.java:1066)
    at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:805)
    at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:218)
    at org.eclipse.jetty.server.HttpConnection.handle(HttpConnection.java:426)
    at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:510)
    at org.eclipse.jetty.io.nio.SelectChannelEndPoint.access$000(SelectChannelEndPoint.java:34)
    at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:40)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)

   Locked ownable synchronizers:
    - <0x95b8e600> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)

What version of the product are you using? On what operating system, file
system, and virtual machine?
Red Hat Enterprise Linux ES release 4 (Nahant Update 3) i686
Java(TM) SE Runtime Environment (build 1.6.0_20-b02)
H2-1.3.159

Do you know a workaround?
No, had to restart process.

What is your use case, meaning why do you need this feature?
Need this to do fulltext search two tables with parent/child relationship.

How important/urgent is the problem for you?
Important! Will perform search on only child table as workaround until fixed.

Please provide any additional information below.
Would be happy to retest on our data if needed.

Original issue reported on code.google.com by kroh.and...@gmail.com on 16 Sep 2011 at 6:05

GoogleCodeExporter commented 8 years ago
Hi,

What version of H2 do you use? A deadlock problem was solved in version 1.3.158 
(2011-07-17) (see the change log).

Could you post the complete thread dump? The stack trace you posted doesn't 
show anything special (anyway at least 2 threads are required for a deadlock).

Regards,
Thomas

Original comment by thomas.t...@gmail.com on 22 Sep 2011 at 4:33

GoogleCodeExporter commented 8 years ago
I'm using 1.3.159. I **was** suspecting some sort of custom non-reentrant 
locking which could deadlock a single thread. But after having some patience I 
watched the thread for 30 minutes and the query did finish (however the H2 web 
console never got the results, probably timed-out). I sampled the thread while 
it was running and have attached the stack traces.

So now my question turns to why does the the aggregate query take so long when 
running the two individual queries take milliseconds.

// Sub query 1: get search hits from child table
select R.* from PARENT R where R.CHILD_ID in (select M.ID from 
FT_SEARCH_DATA('999', 0, 0) S, CHILD M where S.TABLE='CHILD' and M.ID=S.KEYS[0])
(164 rows, 384 ms)

// Sub query 2: get search hits from parent table
select R.* from PARENT R where R.ID in (select T.ID from FT_SEARCH_DATA('999', 
0, 0) S, PARENT T where S.TABLE='PARENT' and T.ID=S.KEYS[0])
(203 rows, 449 ms)

// Aggregate query: get search hits from child table AND parent table
select R.* from PARENT R where R.CHILD_ID in (select M.ID from 
FT_SEARCH_DATA('999', 0, 0) S, CHILD M where S.TABLE='CHILD' and 
M.ID=S.KEYS[0]) or R.ID in (select T.ID from FT_SEARCH_DATA('999', 0, 0) S, 
PARENT T where S.TABLE='PARENT' and T.ID=S.KEYS[0])
(~30mins)

Thanks for looking into this,
Andrew

Original comment by kroh.and...@gmail.com on 22 Sep 2011 at 6:02

Attachments:

GoogleCodeExporter commented 8 years ago
Hi,

Is it possible to use SELECT ... UNION SELECT ... instead of OR? H2 doesn't 
currently convert OR to a UNION, and probably scans all rows in this case, plus 
for each row re-runs the inner query.

Regards,
Thomas

Original comment by thomas.t...@gmail.com on 22 Sep 2011 at 5:49

GoogleCodeExporter commented 8 years ago
Thanks, Thomas. Using UNION SELECT in the query is the solution to my problem. 
My SQL naivety is showing.

select R.* from PARENT R where R.CHILD_ID in (select M.ID from 
FT_SEARCH_DATA('999', 0, 0) S, CHILD M where S.TABLE='CHILD' and 
M.ID=S.KEYS[0]) UNION SELECT R.* from PARENT R where R.ID in (select T.ID from 
FT_SEARCH_DATA('999', 0, 0) S, PARENT T where S.TABLE='PARENT' and 
T.ID=S.KEYS[0])
(327 rows, 941 ms)

Original comment by kroh.and...@gmail.com on 22 Sep 2011 at 6:38

GoogleCodeExporter commented 8 years ago
> My SQL naivety is showing.

The problem isn't your knowledge of SQL, the problem is that H2 is still quite 
dumb in this area.

Original comment by thomas.t...@gmail.com on 22 Sep 2011 at 7:43

GoogleCodeExporter commented 8 years ago
So it wasn't actually a deadlock, but just slow.

Original comment by thomas.t...@gmail.com on 12 Oct 2011 at 5:23

GoogleCodeExporter commented 8 years ago
There is already a feature request for "Optimize A=? OR B=? to UNION if the 
cost is lower.", I will increase the priority.

Original comment by thomas.t...@gmail.com on 12 Oct 2011 at 5:24