Closed GoogleCodeExporter closed 9 years ago
This is interesting. The query befoer the insert gives this plan:
SELECT PE.DISP, P1.SID, P1.LANG_DE AS TYPE, P2.LANG_DE AS RIGHT
FROM PUBLIC.KONT KO /* PUBLIC.KONT_TABLE_SCAN */ /* WHERE KO.KID = '_4038' */
INNER JOIN PUBLIC.PERS PE /* PUBLIC.PERS_TABLE_SCAN */ ON 1=1 /* WHERE
KO.PERS_ID =
PE.PE_ID */
INNER JOIN PUBLIC.LANG P1 /* PUBLIC.LANG_TABLE_SCAN */ ON 1=1 /* WHERE
KO.TYPE_ID =
P1.LID */
INNER JOIN PUBLIC.LANG P2 /* PUBLIC.LANG_TABLE_SCAN */ ON 1=1
WHERE (KO.RIGHT_ID = P2.LID) AND ((KO.TYPE_ID = P1.LID) AND ((KO.KID = '_4038')
AND
(KO.PERS_ID = PE.PE_ID)))
ORDER BY 2, 1
And after the insert gives this plan:
SELECT PE.DISP, P1.SID, P1.LANG_DE AS TYPE, P2.LANG_DE AS RIGHT
FROM PUBLIC.PERS PE /* PUBLIC.PERS_TABLE_SCAN */
INNER JOIN PUBLIC.LANG P1 /* PUBLIC.LANG_TABLE_SCAN */ ON 1=1
INNER JOIN PUBLIC.LANG P2 /* PUBLIC.LANG_TABLE_SCAN */ ON 1=1
INNER JOIN PUBLIC.KONT KO /* PUBLIC.KONT_TABLE_SCAN */ ON 1=1
WHERE (KO.RIGHT_ID = P2.LID) AND ((KO.TYPE_ID = P1.LID) AND ((KO.KID = '_4038')
AND
(KO.PERS_ID = PE.PE_ID)))
ORDER BY 2, 1
If you create an index on *any* column on *any* table, it will always generate
the
first plan and run real fast. Drop the index, slow. Add the index, fast.
The fact that all the tables are lacking any indexes or primary key info
probably has
the optimizer going down some pessimal corner case.
It's a bug, but it seems pretty minor; personally I can't get *too* worked up
over
queries that are problematic on tables with no indexing information.
Chris
Original comment by cscha...@gmail.com
on 22 Aug 2009 at 3:32
To make sure that no false impression arises: This is not a simple thing like:
query
with indexing is fast, query without indexing is slow:
1. before inserting this single row (i.e. without bug) the select runs also
fast (on
my 6 year old PC 40ms) without any indexing
2. with this single row inserted (i.e. with bug effective) the select is not
only
slow: it never gets ready in practical time.
(I waited 1 hour). That means it is unusable in this use case.
I am using H2 as a test/development database. In this use case its not
necessary that
the database performs with maximal achievable performance.
Because of this and because the DB is not that big I had not to deal with
indexing.
Maybe this is a "pretty minor" bug for DB architects, who do not execute any
select
without creating an appropriate index first.
But there are surely other users (like me), who simply want to use H2 without
dealing
with indexing first (of course with not too big databases).
And H2 is capable to operate fast, even without indexing (see 1. above)
Ok, now that I know how to workaround this I can live with it, regardless it is
not a
"pretty minor" bug for me.
Charly
Original comment by gartenzw...@googlemail.com
on 23 Aug 2009 at 11:30
Not a bug. All databases work like this.
Original comment by thomas.t...@gmail.com
on 25 Aug 2009 at 10:12
> Not a bug. All databases work like this.
Lets have a look at some other databases, how they perform in this case.
The same tables and data from the attached script:
Executing 2. Select from script (minor modificated):
SELECT pe.DISP,p1.SID, p1.LANG_DE as type, p2.LANG_DE as rightx from KONT ko,
PERS
pe, LANG p1, LANG p2 where ko.KID='_4038' and
ko.PERS_ID=pe.PE_ID and ko.TYPE_ID=p1.LID and ko.RIGHT_ID=p2.LID order by
p1.SID, pe.DISP
Table KONT has 5936 rows, LANG 164, PERS 4797
Execution times circa: in milliseconds
MS Access: 31 - 63
HSQLDB 1.9.0-rc4: 140 - 234
JavaDB/Derby 10.4.2.0: 31 - 63
H2 1.1.117: timeout exception, i.e. >10000
For me it does not look like "All databases work like this".
For me it looks like "Other databases are working in this case, H2 fails"
If this is "not a bug", i.e. performance without indexing is not a concern then
you
should note that in the documentation.
Maybe with something like this:
"H2 is optimized to perform best with indexing. Without indexing use other
databases,
which work in cases, where H2 fails"
Original comment by gartenzw...@googlemail.com
on 27 Aug 2009 at 8:29
It's good to know that other databases perform better in this case.
I can reproduce this problem up to some point, but by far not as extreme as
with your
data. Could you provide a complete test case please? My test case is (the @LOOP
syntax only works in the H2 Console):
drop table test;
create table test(id int, name varchar(255));
@LOOP 100000 insert into test values(?, 'Hello World');
select * from test t1, test t2, test t3, test t4 where t1.id = 10 and t2.id =
t1.id
and t3.id = t2.id and t4.id = t3.id;
> H2: 3833 ms
> Derby: 271 ms
Original comment by thomas.t...@gmail.com
on 31 Aug 2009 at 5:21
Could you not simply use the originally attached script "issue.sql"? This
creates the
tables with data ("my data"), exactly that data, that I used to make the
comparison
and that fails with H2. It also contains the select statement.
If this is a problem that this script has 1.1MB, then I will try to make
something
simpler like yours.
Original comment by gartenzw...@googlemail.com
on 31 Aug 2009 at 7:14
I have made now a test case, which will show the timeout and is not so big:
SET QUERY_TIMEOUT 20000;
DROP TABLE KONT;
DROP TABLE LANG;
DROP TABLE PERS;
CREATE TABLE KONT (KID varchar(255),PERS_ID varchar(255),TYPE_ID
varchar(255),RIGHT_ID varchar(255));
@LOOP 10 insert into KONT (KID,PERS_ID,TYPE_ID,RIGHT_ID) values
('_4038',trim(char(52275+?)),'KO_2','KO_1');
@LOOP 5926 insert into KONT (KID,PERS_ID,TYPE_ID,RIGHT_ID) values
('_'||trim(char(?+1)),trim(char(50001+?)),'KO_2','KO_1');
CREATE TABLE LANG (LID varchar(255) NOT NULL,SID double,LANG_DE varchar(255));
@LOOP 164 insert into LANG (LID,SID,LANG_DE) values
('KO_'||trim(char(?+1)),?,'--' );
CREATE TABLE PERS (PE_ID varchar(255) NOT NULL,DISP varchar(255));
@LOOP 4797 insert into PERS (PE_ID,DISP) values (trim(char(50001+?)),'-') ;
SELECT pe.DISP,p1.SID, p1.LANG_DE as type, p2.LANG_DE as rightx from KONT ko,
PERS
pe, LANG p1, LANG p2 where ko.KID='_4038' and
ko.PERS_ID=pe.PE_ID and ko.TYPE_ID=p1.LID and ko.RIGHT_ID=p2.LID order by
p1.SID, pe.DISP
I hope this is sufficient for you.
The 'trim(char(' conversion is only needed for derby compatibility. H2 does not
need it.
You can also omit "SET QUERY_TIMEOUT 20000;", but then you will have to wait
long ;)
Original comment by gartenzw...@googlemail.com
on 3 Sep 2009 at 7:06
I found a way to improve performance for such queries. My test case is:
drop table test;
create table test(a int, b int);
@LOOP 20000 insert into test values(?, ?);
select * from test t2, test t1 where t1.a=1 and t1.b = t2.b;
select * from test t1, test t2 where t1.a=1 and t1.b = t2.b;
The fact that your example is fast using HSQLDB is just coincidence it seems:
My test
case is quite slow in HSQLDB.
Anyway, this optimization will be implemented in the next release of H2.
Still, I suggest to create the right indexes.
Original comment by thomas.t...@gmail.com
on 4 Sep 2009 at 1:53
it seems you have found a very simple test case, which shows the bugs.
> Still, I suggest to create the right indexes.
Yes of course .. to maximize performance (by avoiding full table scans) this is
advisable.
I opened this issue because H2 should be also usable (compared with other
databases)
without indexing. With this bug solved, this should be fulfilled.
Thank you for solving.
Original comment by gartenzw...@googlemail.com
on 14 Sep 2009 at 7:03
This should be fixed in version 1.1.119
Original comment by thomas.t...@gmail.com
on 26 Sep 2009 at 11:33
Original issue reported on code.google.com by
gartenzw...@googlemail.com
on 21 Aug 2009 at 9:58Attachments: