Multi-field index not used for some comparisons

GoogleCodeExporter commented 9 years ago

Running this AQL:

drop dataverse test if exists;
create dataverse test;
use dataverse test;
create type TestTypetmp as open {
    fname : string,
    lname : string
}
create dataset testdst(TestTypetmp) primary key fname,lname;
for $emp in dataset('testdst') 
where $emp.fname > "Alex" and $emp.lname < "Zach"
return $emp

Does not use the index but this AQL does:

drop dataverse test if exists;
create dataverse test;
use dataverse test;
create type TestTypetmp as open {
    fname : string,
    lname : string
}
create dataset testdst(TestTypetmp) primary key fname,lname;
for $emp in dataset('testdst') 
where $emp.fname > "Alex" and $emp.lname > "Zach"
return $emp

The only difference is the comparator on they second field.

Original issue reported on code.google.com by stevenja...@gmail.com on 3 Apr 2014 at 6:49

GoogleCodeExporter commented 9 years ago

The plans seem to diverge when the first version decides to switch to "Starting 
physical optimizations" but the second plan does not. Here are the local plan 
traces (Buggy one first):

Apr 03, 2014 12:13:14 PM 
edu.uci.ics.hyracks.algebricks.core.rewriter.base.AbstractRuleController 
printRuleApplication
FINE: >>>> After plan
select (function-call: algebricks:and, Args:[function-call: algebricks:gt, 
Args:[%0->$$7, AString: {Alex}], function-call: algebricks:lt, Args:[%0->$$8, 
AString: {Zach}]]) -- |PARTITIONED|
  data-scan []<-[$$7, $$8, $$0] <- test:testdst -- |PARTITIONED|
    empty-tuple-source -- |PARTITIONED|

Apr 03, 2014 12:13:14 PM 
edu.uci.ics.hyracks.algebricks.core.rewriter.base.HeuristicOptimizer 
runPhysicalOptimizations
FINE: Starting physical optimizations.

Apr 03, 2014 12:13:14 PM 
edu.uci.ics.hyracks.algebricks.core.rewriter.base.AbstractRuleController 
printRuleApplication
FINE: >>>> Rule class 
edu.uci.ics.hyracks.algebricks.rewriter.rules.SetAlgebricksPhysicalOperatorsRule
 fired.

Apr 03, 2014 12:13:14 PM 
edu.uci.ics.hyracks.algebricks.core.rewriter.base.AbstractRuleController 
printRuleApplication
FINE: >>>> Before plan
distribute result [%0->$$0] -- |PARTITIONED|
  project ([$$0]) -- |PARTITIONED|
    select (function-call: algebricks:and, Args:[function-call: algebricks:gt, Args:[%0->$$7, AString: {Alex}], function-call: algebricks:lt, Args:[%0->$$8, AString: {Zach}]]) -- |PARTITIONED|
      data-scan []<-[$$7, $$8, $$0] <- test:testdst -- |PARTITIONED|
        empty-tuple-source -- |PARTITIONED|

verses this one:

Apr 03, 2014 12:13:52 PM 
edu.uci.ics.hyracks.algebricks.core.rewriter.base.AbstractRuleController 
printRuleApplication
FINE: >>>> After plan
select (function-call: algebricks:and, Args:[function-call: algebricks:gt, 
Args:[%0->$$7, AString: {Alex}], function-call: algebricks:gt, Args:[%0->$$8, 
AString: {Zach}]]) -- |PARTITIONED|
  data-scan []<-[$$7, $$8, $$0] <- test:testdst -- |PARTITIONED|
    empty-tuple-source -- |PARTITIONED|

Apr 03, 2014 12:13:52 PM 
edu.uci.ics.hyracks.algebricks.core.rewriter.base.AbstractRuleController 
printRuleApplication
FINE: >>>> Rule class 
edu.uci.ics.asterix.optimizer.rules.am.IntroduceSelectAccessMethodRule fired.

Apr 03, 2014 12:13:52 PM 
edu.uci.ics.hyracks.algebricks.core.rewriter.base.AbstractRuleController 
printRuleApplication
FINE: >>>> Before plan
select (function-call: algebricks:and, Args:[function-call: algebricks:gt, 
Args:[%0->$$7, AString: {Alex}], function-call: algebricks:gt, Args:[%0->$$8, 
AString: {Zach}]]) -- |PARTITIONED|
  data-scan []<-[$$7, $$8, $$0] <- test:testdst -- |PARTITIONED|
    empty-tuple-source -- |PARTITIONED|

Original comment by stevenja...@gmail.com on 3 Apr 2014 at 7:27

GoogleCodeExporter commented 9 years ago

Take a look at this old issue, this one may be related.

Issue 174: Composite key primary BTree index does not get picked up in some 
range predicates

Original comment by khfaraaz82 on 3 Apr 2014 at 11:45

GoogleCodeExporter commented 9 years ago

I found the code producing this problem and it seems to be done on purpose 
(Basically if the limits come from opposite directions return null):

        // Rule out the cases unsupported by the current btree search
        // implementation.
        for (int i = 1; i < numSecondaryKeys; i++) {
            if (lowKeyLimits[0] == null && lowKeyLimits[i] != null || lowKeyLimits[0] != null
                    && lowKeyLimits[i] == null) {
            //    return null;
            }
            if (highKeyLimits[0] == null && highKeyLimits[i] != null || highKeyLimits[0] != null
                    && highKeyLimits[i] == null) {
            //    return null;
            }
        }

I am still working on why this code is here. simply taking it out obviously 
leads to problems :)

It looks like Sattam made this change originally. Sattam, do you remember 
anything about this?

Original comment by stevenja...@gmail.com on 7 Apr 2014 at 10:44

GoogleCodeExporter commented 9 years ago

This is a non-issue. It is simply a limitation on the current implementation of 
index search that we allow.

Original comment by sjaco...@ucr.edu on 21 Oct 2014 at 6:08

Removed labels: Type-Defect

GoogleCodeExporter commented 9 years ago

This is a non-issue. It is simply a limitation on the current implementation of 
index search that we allow.

Original comment by sjaco...@ucr.edu on 21 Oct 2014 at 6:10

Changed state: WontFix

GoogleCodeExporter commented 9 years ago

How can you say this is a non-issue?  I read that as "oh, this is broken but we 
knew that, which I hereby declare to mean non-issue, wontfix, close".  Please 
don't do that.  This is a legitimate query with a legitimate index 
applicability - worst case the second part of the predicate could be stripped 
off and the first part used to do the query and the second part used to filter 
false positives from the first part - no?

Original comment by dtab...@gmail.com on 21 Oct 2014 at 7:51

Changed state: Accepted

GoogleCodeExporter commented 9 years ago

PS: "Worst case" resolution of such things should be, with permission, to 
change a bug report into an enhancement request....

Original comment by dtab...@gmail.com on 21 Oct 2014 at 7:52

GoogleCodeExporter commented 9 years ago

In this case I realized that nothing is "broken" (Query returns correct 
results), which is why I thought of it as a non-issue. I definitely see why it 
could be added as an enhancement request though. I'm the one who made this 
request originally, so I was looking at it as something that I issued as a bug 
which was not actually a bug, and therefore I shouldn't have issued it in the 
first place.

Should I close it as a bug, and recreate it as an enhancement request with a 
better description of what should be changed?

Original comment by sjaco...@ucr.edu on 21 Oct 2014 at 8:54

GoogleCodeExporter commented 9 years ago

Yes, recreating this one as an enhancement request would be great.  There may 
already be a numbered issue for that (or something related) - not sure.  Thx!  
Correct but slow isn't a state we want to be in with "any" query, over time, if 
we can avoid it. :-)

Original comment by dtab...@gmail.com on 21 Oct 2014 at 10:50

GoogleCodeExporter commented 9 years ago

This issue has been recreated as an enhancement

Original comment by sjaco...@ucr.edu on 21 Oct 2014 at 11:12

Changed state: Done
Removed labels: Priority-Medium

namhnguyen / asterixdb

Multi-field index not used for some comparisons #751