sagarswathi / h2database

Automatically exported from code.google.com/p/h2database
0 stars 1 forks source link

When there is a multi-column primary key, H2 does not seem to always pick the right index #389

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
The problem seems to happens in this particular case:
- there is a primary key defined on two columns: (K1,K2)
- there is also an index (K1,VAL)

When H2 receives this query:

  SELECT * FROM my_table WHERE K1=7 ORDER BY K1,VAL

according to the EXPLAIN PLAN FOR command, H2 seems to use the index defined by 
the primary key (K1,K2), instead of the other index (K1,VAL).
I believe if H2 used the (K1,VAL) index, it would be faster because the result 
would already be sorted.

This problem does not seem to happen if instead of a primary key, I define 
(K1,K2) as a primary key with a unique constraint.

Here are the commands
CREATE TABLE my_table(
    K1 INT,
    K2 INT,
    VAL VARCHAR,
    PRIMARY KEY(K1,K2));
CREATE INDEX my_index ON my_table(K1,VARCHAR);
EXPLAIN PLAN FOR SELECT * FROM my_table WHERE K1=7 ORDER BY K1,VAL

I'm using the last version of H2, 1.3.165, on MacOS.

If you could take a look at this, that would be very helpful.
Oh, and by the way - thanks for the hard work!

Original issue reported on code.google.com by cleme...@google.com on 5 Apr 2012 at 1:17

GoogleCodeExporter commented 8 years ago
Note to self: 

The problem here is that the following method in org.h2.index.Index:
    double getCost(Session session, int[] masks);
has no way of distinguishing between the costs of accessing the two different 
indices.

Working back up the call-tree :
   org.h2.table.Table#getBestPlanItem
   org.h2.table.TableFilter#getBestPlanItem
   org.h2.table.Plan#calculateCost
   org.h2.command.dml.Optimizer#testPlan

we'd need to teach the code somewhere here about index-only vs. table-data 
scans.

Original comment by noelgrandin on 7 Feb 2013 at 1:55

GoogleCodeExporter commented 8 years ago
Issue 409 has been merged into this issue.

Original comment by noelgrandin on 8 Feb 2013 at 2:10

GoogleCodeExporter commented 8 years ago
Yes, one solution would be to extend getCost(Session session, int[] masks) to 
include the sort order. For example:
getCost(Session session, int[] masks, SortOrder order)

Original comment by thomas.t...@gmail.com on 9 Feb 2013 at 11:13

GoogleCodeExporter commented 8 years ago
Fixed in revision 4720

Original comment by noelgrandin on 3 Apr 2013 at 3:32

GoogleCodeExporter commented 8 years ago

Original comment by noelgrandin on 3 Apr 2013 at 3:32