Hi,
I'm sorry, this will require some more time. Merge joins are not yet
implemented in H2. There is a feature request, and I will increase its
priority.
There is a workaround: give H2 a 'hint' not to use the index on e1.value.
Writing 'x' || e1.value turns the condition into an expression rather than
a plain column comparison, so the index on value can't be used for it.
See the last query (which is fast):
DROP TABLE entity;
DROP TABLE relation;
CREATE TABLE entity (id INT NOT NULL, rel_id INT,
value VARCHAR(200), PRIMARY KEY(id));
CREATE INDEX rel_index ON entity(rel_id);
CREATE INDEX value_index ON entity(value);
CREATE TABLE relation (id INT NOT NULL, PRIMARY KEY(id));
INSERT INTO relation(id) select x from system_range(1, 10);
INSERT INTO entity(id, rel_id, value)
select x, casewhen(rand()<0.1, mod(x, 10), null),
'COMMONVALUE' from system_range(1, 1000);
-- Slow: the optimizer uses the index on e1.value
SELECT e1.value FROM entity AS e1, relation, entity AS e2
WHERE ((e1.value = 'COMMONVALUE')
AND ((e2.value = 'NOTEXISTVALUE1') OR (e2.value = 'NOTEXISTVALUE2')))
AND ((e2.rel_id = relation.id) AND (e1.rel_id = relation.id));
-- Fast: 'x' || e1.value hides the column from the index
SELECT e1.value FROM entity AS e1, relation, entity AS e2
WHERE (('x' || e1.value = 'x' || 'COMMONVALUE')
AND (e2.value IN ('NOTEXISTVALUE1', 'NOTEXISTVALUE2')))
AND ((e2.rel_id = relation.id) AND (e1.rel_id = relation.id));
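To see which plan H2 actually picks for either variant, you can prefix the
query with EXPLAIN, for example:
-- EXPLAIN prints the chosen plan, including the indexes used:
EXPLAIN SELECT e1.value FROM entity AS e1, relation, entity AS e2
WHERE ((e1.value = 'COMMONVALUE')
AND ((e2.value = 'NOTEXISTVALUE1') OR (e2.value = 'NOTEXISTVALUE2')))
AND ((e2.rel_id = relation.id) AND (e1.rel_id = relation.id));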
Original comment by thomas.t...@gmail.com
on 5 Dec 2009 at 2:54
Got it, thanks!
Unfortunately, we can't hack the SQL SELECT query because it is generated
by the JPA provider. Is there some workaround that can be executed as a
separate query during DB initialization?
Here's what I got from optimizer instrumentation:
Computing cost for plan:
[E1, PUBLIC.RELATION, E2]
cost before adjusting = 40.0
cost after adjusting = 38.8
org.h2.table.PlanItem@12940b3 = 38.8
cost before adjusting = 30.0
cost after adjusting = 29.7
org.h2.table.PlanItem@156b6b9 = 29.7
cost before adjusting = 40.0
cost after adjusting = 39.6
org.h2.table.PlanItem@1f66cff = 39.6
[E1, PUBLIC.RELATION, E2] = 49607.515999999996
Computing cost for plan:
[E1, E2, PUBLIC.RELATION]
cost before adjusting = 40.0
cost after adjusting = 38.8
org.h2.table.PlanItem@16de49c = 38.8
cost before adjusting = 40.0
cost after adjusting = 39.4
org.h2.table.PlanItem@1bbf1ca = 39.4
cost before adjusting = 30.0
cost after adjusting = 29.8
org.h2.table.PlanItem@1ff0dde = 29.8
[E1, E2, PUBLIC.RELATION] = 49523.935999999994
The first plan is very fast and the second is very slow (as described in
the original issue). The cost difference is very small, though. What can be
done to bias the cost in favor of the first plan?
Original comment by aleksey....@gmail.com
on 5 Dec 2009 at 3:05
> Is there some workaround
No, I'm afraid not. You can't use a different selectivity for the column,
because it's actually the same column...
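For reference, a selectivity override (which could be run once at
initialization) would look something like the statement below; but since it
applies to the column itself, it can't distinguish e1.value from e2.value:
-- 100 means all values distinct, 1 means almost all values identical:
ALTER TABLE entity ALTER COLUMN value SELECTIVITY 100;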
Original comment by thomas.t...@gmail.com
on 6 Dec 2009 at 6:51
So the optimizer can't get the information that rel_id is mostly null? If
it could, the plan [E1, PUBLIC.RELATION, E2] would cost a lot less than
[E1, E2, PUBLIC.RELATION].
Original comment by aleksey....@gmail.com
on 6 Dec 2009 at 7:29
Hi,
> So the optimizer can't get the information that rel_id is mostly null?
Yes, the optimizer knows that (the selectivity is 1).
> If it could, the plan [E1, PUBLIC.RELATION, E2] would cost a lot less
> than [E1, E2, PUBLIC.RELATION].
Actually, I think the problem is not the query plan, but that null should
be ignored for the index lookup... Example:
drop table test;
create table test(id int) as select x from system_range(1, 10);
-- add 2000 rows where id is null:
insert into test select null from system_range(1, 2000);
create index idx_id on test(id);
analyze;
select * from test t1, test t2 where t1.id = t2.id;
The query takes 400 ms. It is using an index lookup, which is OK. But it
should ignore the rows where t1.id is null, and it doesn't currently do
that: it looks like it is matching every row where t1.id is null against
all rows where t2.id is null, and filtering the rows out later (in SQL,
NULL = x is NULL). It should filter those rows out immediately.
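As a quick check of the NULL semantics, and a possible interim workaround
(adding the explicit IS NOT NULL condition may let the engine skip the
null rows earlier, though I haven't verified that):
-- comparing anything with NULL yields NULL (unknown), never TRUE:
select null = null; -- NULL
select 1 = null;    -- NULL
-- excluding the null rows explicitly:
select * from test t1, test t2
where t1.id = t2.id and t1.id is not null;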
I will have a look.
Regards,
Thomas
Original comment by thomas.t...@gmail.com
on 8 Dec 2009 at 8:36
This problem should be fixed in version 1.2.126
Original comment by thomas.t...@gmail.com
on 18 Dec 2009 at 9:18
Verified in production. The degradation is gone, thanks!
Original comment by aleksey....@gmail.com
on 18 Dec 2009 at 11:19
I ran the above test and it again took half a second. I tried it with 20000
NULLs and it took nearly one minute. It looks like a regression.
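For reference, the variant I tried (the same test as above, with 20000
NULL rows instead of 2000):
drop table test;
create table test(id int) as select x from system_range(1, 10);
insert into test select null from system_range(1, 20000);
create index idx_id on test(id);
analyze;
select * from test t1, test t2 where t1.id = t2.id;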
Original comment by Maaarti...@gmail.com
on 5 Apr 2011 at 10:14
Original issue reported on code.google.com by aleksey....@gmail.com
on 5 Dec 2009 at 1:56