Closed GoogleCodeExporter closed 9 years ago
Are you trying this on master? I don't get NPE for this query.
Original comment by icetin...@gmail.com
on 30 Apr 2014 at 8:40
I was using my local branch. :)
I will try this in master branch and invalidate this issue if this exception is
not thrown.
Original comment by kiss...@gmail.com
on 30 Apr 2014 at 6:31
I tested on the master branch. It shows the same error as in the initial
issue report.
The DDL statements and the query are shown below.
--------------- ddl statements --------------
drop dataverse feeds if exists;
create dataverse feeds;
use dataverse feeds;
create type TwitterUserType as closed {
screen-name: string,
lang: string,
friends-count: int32,
statuses-count: int32,
name: string,
followers-count: int32
}
create type TweetMessageType as closed {
tweetid: int64,
user: TwitterUserType,
sender-location: point,
send-time: datetime,
referred-topics: {{ string }},
message-text: string,
countA: int32,
countB: int32
}
create dataset TweetMessages(TweetMessageType)
primary key tweetid;
create index twmSndLocIx on TweetMessages(sender-location) type rtree;
create index msgCountAIx on TweetMessages(countA) type btree;
create index msgCountBIx on TweetMessages(countB) type btree;
create index msgTextIx on TweetMessages(message-text) type keyword;
------------ query ----------------------
use dataverse feeds;
for $t1 in dataset('TweetMessages')
for $t2 in dataset('TweetMessages')
let $sim := similarity-jaccard-check($t1.message-text, $t2.message-text, 0.6f)
where $sim[0] and $t1.tweetid < int64("20") and $t2.tweetid != $t1.tweetid
return {
"t1": $t1.tweetid,
"t2": $t2.tweetid,
"sim": $sim[1]
}
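For readers unfamiliar with the `$sim[0]` / `$sim[1]` usage in the query above, here is a rough Python sketch of the shape of result the query relies on: a two-item list holding a boolean flag and a similarity value. This is only an illustration, not the AsterixDB implementation. The function name `jaccard_check`, the whitespace tokenization, the `>=` comparison, and returning 0.0 when the check fails or both inputs are empty are all assumptions made for the example.

```python
def jaccard_check(tokens_a, tokens_b, threshold):
    """Toy Jaccard check returning [is_similar, similarity].

    Assumptions (not taken from the issue): inputs are already tokenized,
    the threshold comparison is >=, and a failed check reports 0.0.
    """
    set_a, set_b = set(tokens_a), set(tokens_b)
    union = set_a | set_b
    # Jaccard similarity = |A intersect B| / |A union B|; 0.0 for two empty inputs.
    similarity = len(set_a & set_b) / len(union) if union else 0.0
    if similarity >= threshold:
        return [True, similarity]
    return [False, 0.0]


# Two hypothetical message texts, split on whitespace.
result = jaccard_check("a b c".split(), "a b c d".split(), 0.6)
is_similar, similarity = result[0], result[1]  # mirrors $sim[0] and $sim[1]
```

In this toy model the two inputs share 3 of 4 distinct tokens, so the similarity is 0.75 and the check passes at a threshold of 0.6.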
Original comment by kiss...@gmail.com
on 1 May 2014 at 4:36
I added the log message from cc.log below.
------------------------
INFO: Cleanup for JobRun with id: JID:8
Apr 30, 2014 9:38:13 PM
edu.uci.ics.hyracks.control.common.work.WorkQueue$WorkerThread run
INFO: Executing:
edu.uci.ics.hyracks.control.cc.work.JobletCleanupNotificationWork@15a8e003
java.lang.NullPointerException
at edu.uci.ics.asterix.optimizer.rules.am.InvertedIndexAccessMethod.applyJoinPlanTransformation(InvertedIndexAccessMethod.java:408)
at edu.uci.ics.asterix.optimizer.rules.am.IntroduceJoinAccessMethodRule.rewritePost(IntroduceJoinAccessMethodRule.java:128)
at edu.uci.ics.hyracks.algebricks.core.rewriter.base.AbstractRuleController.rewriteOperatorRef(AbstractRuleController.java:122)
at edu.uci.ics.hyracks.algebricks.core.rewriter.base.AbstractRuleController.rewriteOperatorRef(AbstractRuleController.java:96)
at edu.uci.ics.hyracks.algebricks.core.rewriter.base.AbstractRuleController.rewriteOperatorRef(AbstractRuleController.java:96)
at edu.uci.ics.hyracks.algebricks.core.rewriter.base.AbstractRuleController.rewriteOperatorRef(AbstractRuleController.java:96)
at edu.uci.ics.hyracks.algebricks.compiler.rewriter.rulecontrollers.SequentialFixpointRuleController.rewriteWithRuleCollection(SequentialFixpointRuleController.java:49)
at edu.uci.ics.hyracks.algebricks.core.rewriter.base.HeuristicOptimizer.runOptimizationSets(HeuristicOptimizer.java:91)
at edu.uci.ics.hyracks.algebricks.core.rewriter.base.HeuristicOptimizer.optimize(HeuristicOptimizer.java:78)
at edu.uci.ics.hyracks.algebricks.compiler.api.HeuristicCompilerFactoryBuilder$1$1.optimize(HeuristicCompilerFactoryBuilder.java:83)
at edu.uci.ics.asterix.api.common.APIFramework.compileQuery(APIFramework.java:285)
at edu.uci.ics.asterix.aql.translator.AqlTranslator.rewriteCompileQuery(AqlTranslator.java:1428)
at edu.uci.ics.asterix.aql.translator.AqlTranslator.handleQuery(AqlTranslator.java:1724)
at edu.uci.ics.asterix.aql.translator.AqlTranslator.compileAndExecute(AqlTranslator.java:301)
at edu.uci.ics.asterix.api.http.servlet.APIServlet.doPost(APIServlet.java:97)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:754)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:847)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:546)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:483)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:970)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:411)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:192)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:904)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:110)
at org.eclipse.jetty.server.Server.handle(Server.java:347)
at org.eclipse.jetty.server.HttpConnection.handleRequest(HttpConnection.java:439)
at org.eclipse.jetty.server.HttpConnection$RequestHandler.content(HttpConnection.java:924)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:781)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:220)
at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:43)
at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:545)
at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:43)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:529)
at java.lang.Thread.run(Thread.java:722)
Apr 30, 2014 9:38:25 PM edu.uci.ics.asterix.api.http.servlet.APIServlet doPost
SEVERE: null
java.lang.NullPointerException
at edu.uci.ics.asterix.optimizer.rules.am.InvertedIndexAccessMethod.applyJoinPlanTransformation(InvertedIndexAccessMethod.java:408)
at edu.uci.ics.asterix.optimizer.rules.am.IntroduceJoinAccessMethodRule.rewritePost(IntroduceJoinAccessMethodRule.java:128)
at edu.uci.ics.hyracks.algebricks.core.rewriter.base.AbstractRuleController.rewriteOperatorRef(AbstractRuleController.java:122)
at edu.uci.ics.hyracks.algebricks.core.rewriter.base.AbstractRuleController.rewriteOperatorRef(AbstractRuleController.java:96)
at edu.uci.ics.hyracks.algebricks.core.rewriter.base.AbstractRuleController.rewriteOperatorRef(AbstractRuleController.java:96)
at edu.uci.ics.hyracks.algebricks.core.rewriter.base.AbstractRuleController.rewriteOperatorRef(AbstractRuleController.java:96)
at edu.uci.ics.hyracks.algebricks.compiler.rewriter.rulecontrollers.SequentialFixpointRuleController.rewriteWithRuleCollection(SequentialFixpointRuleController.java:49)
at edu.uci.ics.hyracks.algebricks.core.rewriter.base.HeuristicOptimizer.runOptimizationSets(HeuristicOptimizer.java:91)
at edu.uci.ics.hyracks.algebricks.core.rewriter.base.HeuristicOptimizer.optimize(HeuristicOptimizer.java:78)
at edu.uci.ics.hyracks.algebricks.compiler.api.HeuristicCompilerFactoryBuilder$1$1.optimize(HeuristicCompilerFactoryBuilder.java:83)
at edu.uci.ics.asterix.api.common.APIFramework.compileQuery(APIFramework.java:285)
at edu.uci.ics.asterix.aql.translator.AqlTranslator.rewriteCompileQuery(AqlTranslator.java:1428)
at edu.uci.ics.asterix.aql.translator.AqlTranslator.handleQuery(AqlTranslator.java:1724)
at edu.uci.ics.asterix.aql.translator.AqlTranslator.compileAndExecute(AqlTranslator.java:301)
at edu.uci.ics.asterix.api.http.servlet.APIServlet.doPost(APIServlet.java:97)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:754)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:847)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:546)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:483)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:970)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:411)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:192)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:904)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:110)
at org.eclipse.jetty.server.Server.handle(Server.java:347)
at org.eclipse.jetty.server.HttpConnection.handleRequest(HttpConnection.java:439)
at org.eclipse.jetty.server.HttpConnection$RequestHandler.content(HttpConnection.java:924)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:781)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:220)
at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:43)
at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:545)
at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:43)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:529)
at java.lang.Thread.run(Thread.java:722)
Apr 30, 2014 9:38:58 PM
edu.uci.ics.hyracks.control.common.dataset.ResultStateSweeper sweep
INFO: Result state cleanup instance successfully completed.
Original comment by kiss...@gmail.com
on 1 May 2014 at 4:40
OK, I'll look into this. When I ran it, I didn't have all the indexes that you
have, so I didn't hit the NPE. Now that you have provided your DDL statements,
I can reproduce the issue.
Original comment by icetin...@gmail.com
on 1 May 2014 at 6:15
After some investigation, I realized that this issue is not specific to fuzzy
joins; it can also happen with any other index access method. The problem
occurs when we apply multiple access methods to the same plan. Our current rule
assumes that the source of the data is always a DataSourceScan; however, after
we apply the B-tree access method to the plan, we no longer scan the data but
retrieve it from the index. So we need to make sure that our rule also
considers this kind of data source.
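
The failure mode described above can be illustrated with a small toy model. This is a sketch in Python, not the actual Java optimizer code: the `Op` class and the `find_scan_only` and `find_data_source` helpers are hypothetical names invented for the example. It assumes a branch is a simple chain of operators, and that an index search shows up as an unnest-map operator, as it does in the plans posted in this thread.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Op:
    """Hypothetical single-input plan operator (toy model only)."""
    kind: str                      # e.g. "SELECT", "DATA_SOURCE_SCAN", "UNNEST_MAP"
    dataset: Optional[str] = None  # dataset name for source operators
    child: Optional["Op"] = None   # the operator's single input, if any


def find_scan_only(op):
    """Walk down the branch and return the first DATA_SOURCE_SCAN, else None."""
    while op is not None:
        if op.kind == "DATA_SOURCE_SCAN":
            return op
        op = op.child
    return None


def find_data_source(op):
    """Same walk, but also accept an UNNEST_MAP (index search) as the source."""
    while op is not None:
        if op.kind in ("DATA_SOURCE_SCAN", "UNNEST_MAP"):
            return op
        op = op.child
    return None


# Branch before any access method has been applied: the source is a scan.
scan_branch = Op("SELECT", child=Op("DATA_SOURCE_SCAN", dataset="TweetMessages"))

# Branch after an index access method replaced the scan with an index search.
index_branch = Op(
    "SELECT",
    child=Op("UNNEST_MAP", dataset="TweetMessages", child=Op("EMPTY_TUPLE_SOURCE")),
)

# A rule that only looks for scans finds nothing on the rewritten branch;
# dereferencing that missing result is the toy analogue of the NPE.
missing = find_scan_only(index_branch)
found = find_data_source(index_branch)
```

In the toy model, `missing` is `None` while `found` is the unnest-map operator, which is the kind of case a second access-method rule would need to handle when an earlier rule has already rewritten the branch.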
Original comment by icetin...@gmail.com
on 1 May 2014 at 9:18
I think that only the inverted index (a secondary index) fails when another
index has already been picked in the inner join plan.
The following queries, which use one primary index and one secondary index
(either btree or rtree), DO work without throwing the exception. The
corresponding query plan is shown after each query below.
#query1------------------------------------------
use dataverse feeds;
for $t1 in dataset('TweetMessages')
for $t2 in dataset('TweetMessages')
where $t1.countA /*+indexnl*/ = $t2.countB and $t1.tweetid < int64("20") and
$t2.tweetid != $t1.tweetid
return {
"t1": $t1.tweetid,
"t2": $t2.tweetid
}
#plan1
distribute result [%0->$$14]
-- DISTRIBUTE_RESULT |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |PARTITIONED|
project ([$$14])
-- STREAM_PROJECT |PARTITIONED|
assign [$$14] <- [function-call: asterix:closed-record-constructor, Args:[AString: {t1}, %0->$$19, AString: {t2}, %0->$$20]]
-- ASSIGN |PARTITIONED|
project ([$$19, $$20])
-- STREAM_PROJECT |PARTITIONED|
select (function-call: algebricks:and, Args:[function-call: algebricks:neq, Args:[%0->$$20, %0->$$19], function-call: algebricks:eq, Args:[%0->$$21, function-call: asterix:field-access-by-index, Args:[%0->$$1, AInt32: {7}]]])
-- STREAM_SELECT |PARTITIONED|
project ([$$1, $$19, $$21, $$20])
-- STREAM_PROJECT |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |PARTITIONED|
unnest-map [$$20, $$1] <- function-call: asterix:index-search, Args:[AString: {TweetMessages}, AInt32: {0}, AString: {feeds}, AString: {TweetMessages}, ABoolean: {true}, ABoolean: {false}, AInt32: {1}, %0->$$27, AInt32: {1}, %0->$$27, TRUE, TRUE, FALSE]
-- BTREE_SEARCH |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |PARTITIONED|
order (ASC, %0->$$27)
-- STABLE_SORT [$$27(ASC)] |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |PARTITIONED|
project ([$$19, $$21, $$27])
-- STREAM_PROJECT |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |PARTITIONED|
unnest-map [$$26, $$27] <- function-call: asterix:index-search, Args:[AString: {msgCountBIx}, AInt32: {0}, AString: {feeds}, AString: {TweetMessages}, ABoolean: {true}, ABoolean: {true}, AInt32: {1}, %0->$$21, AInt32: {1}, %0->$$21, TRUE, TRUE, TRUE]
-- BTREE_SEARCH |PARTITIONED|
exchange
-- BROADCAST_EXCHANGE |PARTITIONED|
project ([$$19, $$21])
-- STREAM_PROJECT |PARTITIONED|
assign [$$21] <- [function-call: asterix:field-access-by-index, Args:[%0->$$0, AInt32: {6}]]
-- ASSIGN |PARTITIONED|
project ([$$0, $$19])
-- STREAM_PROJECT |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |PARTITIONED|
unnest-map [$$19, $$0] <- function-call: asterix:index-search, Args:[AString: {TweetMessages}, AInt32: {0}, AString: {feeds}, AString: {TweetMessages}, ABoolean: {false}, ABoolean: {false}, AInt32: {0}, AInt32: {1}, %0->$$23, TRUE, FALSE, FALSE]
-- BTREE_SEARCH |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |PARTITIONED|
assign [$$23] <- [AInt64: {20}]
-- ASSIGN |PARTITIONED|
empty-tuple-source
-- EMPTY_TUPLE_SOURCE |PARTITIONED|
#query2------------------------------------------
use dataverse feeds;
for $t1 in dataset('TweetMessages')
for $t2 in dataset('TweetMessages')
let $n := create-circle($t1.sender-location, 0.5)
where spatial-intersect($t2.sender-location, $n) and $t1.tweetid < int64("20")
and $t1.tweetid != $t2.tweetid
return {
"tweetid1": $t1.tweetid,
"tweetid2": $t2.tweetid
};
#plan2
distribute result [%0->$$16]
-- DISTRIBUTE_RESULT |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |PARTITIONED|
project ([$$16])
-- STREAM_PROJECT |PARTITIONED|
assign [$$16] <- [function-call: asterix:closed-record-constructor, Args:[AString: {tweetid1}, %0->$$21, AString: {tweetid2}, %0->$$22]]
-- ASSIGN |PARTITIONED|
project ([$$21, $$22])
-- STREAM_PROJECT |PARTITIONED|
select (function-call: algebricks:and, Args:[function-call: algebricks:neq, Args:[%0->$$21, %0->$$22], function-call: asterix:spatial-intersect, Args:[function-call: asterix:field-access-by-index, Args:[%0->$$1, AInt32: {2}], %0->$$2]])
-- STREAM_SELECT |PARTITIONED|
project ([$$1, $$2, $$21, $$22])
-- STREAM_PROJECT |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |PARTITIONED|
unnest-map [$$22, $$1] <- function-call: asterix:index-search, Args:[AString: {TweetMessages}, AInt32: {0}, AString: {feeds}, AString: {TweetMessages}, ABoolean: {true}, ABoolean: {false}, AInt32: {1}, %0->$$36, AInt32: {1}, %0->$$36, TRUE, TRUE, FALSE]
-- BTREE_SEARCH |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |PARTITIONED|
order (ASC, %0->$$36)
-- STABLE_SORT [$$36(ASC)] |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |PARTITIONED|
project ([$$2, $$21, $$36])
-- STREAM_PROJECT |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |PARTITIONED|
unnest-map [$$32, $$33, $$34, $$35, $$36] <- function-call: asterix:index-search, Args:[AString: {twmSndLocIx}, AInt32: {1}, AString: {feeds}, AString: {TweetMessages}, ABoolean: {true}, ABoolean: {true}, AInt32: {4}, %0->$$28, %0->$$29, %0->$$30, %0->$$31]
-- RTREE_SEARCH |PARTITIONED|
exchange
-- BROADCAST_EXCHANGE |PARTITIONED|
assign [$$28, $$29, $$30, $$31] <- [function-call: asterix:create-mbr, Args:[%0->$$2, AInt32: {2}, AInt32: {0}], function-call: asterix:create-mbr, Args:[%0->$$2, AInt32: {2}, AInt32: {1}], function-call: asterix:create-mbr, Args:[%0->$$2, AInt32: {2}, AInt32: {2}], function-call: asterix:create-mbr, Args:[%0->$$2, AInt32: {2}, AInt32: {3}]]
-- ASSIGN |PARTITIONED|
project ([$$2, $$21])
-- STREAM_PROJECT |PARTITIONED|
assign [$$2] <- [function-call: asterix:create-circle, Args:[function-call: asterix:field-access-by-index, Args:[%0->$$0, AInt32: {2}], ADouble: {0.5}]]
-- ASSIGN |PARTITIONED|
project ([$$0, $$21])
-- STREAM_PROJECT |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |PARTITIONED|
unnest-map [$$21, $$0] <- function-call: asterix:index-search, Args:[AString: {TweetMessages}, AInt32: {0}, AString: {feeds}, AString: {TweetMessages}, ABoolean: {false}, ABoolean: {false}, AInt32: {0}, AInt32: {1}, %0->$$25, TRUE, FALSE, FALSE]
-- BTREE_SEARCH |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |PARTITIONED|
assign [$$25] <- [AInt64: {20}]
-- ASSIGN |PARTITIONED|
empty-tuple-source
-- EMPTY_TUPLE_SOURCE |PARTITIONED|
Original comment by kiss...@gmail.com
on 1 May 2014 at 4:40
Original comment by icetin...@gmail.com
on 8 May 2014 at 1:09
Original comment by icetin...@gmail.com
on 9 May 2014 at 12:22
Original issue reported on code.google.com by
kiss...@gmail.com
on 30 Apr 2014 at 7:55