Optimizer: index usage when both ascending and descending indexes are available

lbehnke / h2database

Automatically exported from code.google.com/p/h2database

0 stars 0 forks source link

Optimizer: index usage when both ascending and descending indexes are available #178

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago

H2 currently always uses the same index no matter what the query looks
like. It will use the first index:

drop table test;
create table test(id int);
create index idx_id_desc on test(id desc);
create index idx_id_asc on test(id);
explain select * from test where id > 100 limit 1;
explain select * from test where id < 100 limit 1;

If you create the ascending index first, it will use that one.

Original issue reported on code.google.com by thomas.t...@gmail.com on 20 Mar 2010 at 9:48

GoogleCodeExporter commented 9 years ago

Original comment by thomas.t...@gmail.com on 21 Mar 2010 at 11:31

GoogleCodeExporter commented 9 years ago

Original comment by thomas.t...@gmail.com on 21 Mar 2010 at 11:31

Changed state: Accepted

GoogleCodeExporter commented 9 years ago

Original comment by thomas.t...@gmail.com on 28 Jun 2010 at 6:40

Changed state: InRoadmap

GoogleCodeExporter commented 9 years ago

Solving this properly will require storing a histogram of the distribution of 
the index, and picking the index which returns the least data i.e. the index 
which has the lowest cost.

For simplicity, it should be sufficient to store a fixed size histogram (say 32 
entries) and interpolate cost results. That will cover the majority of common 
distributions (normal, bi-normal, exponential, power-law).

Original comment by noelgrandin on 7 Mar 2011 at 11:51

GoogleCodeExporter commented 9 years ago

> a histogram of the distribution of the index

Yes.

About sampling: I guess the current mechanism to pick entries should be 
improved a bit, so that random rows are picked, and not just the first few.

Original comment by thomas.t...@gmail.com on 11 Mar 2011 at 12:18

GoogleCodeExporter commented 9 years ago

The costing for the sample query is actually the same whether the index is 
ascending or descending. This is because once we have a histogram, it doesn't 
make any difference whether we are stepping forwards or backwards through the 
index.

Can anyone come up with a simple case where the choice of ascending/descending 
indexes actually should make a difference?

Original comment by noelgrandin on 14 Mar 2011 at 11:34