Select executes very long or infinite or until timeout

GoogleCodeExporter commented 9 years ago

This Issue is described in group H2-database:
http://groups.google.com/group/h2-database/browse_thread/thread/9123e6b750f17a04
#

What steps will reproduce the problem?
(simple SQL scripts or simple standalone applications are preferred)
1. create the database/tables with script 
2. execute select statement (executes fast)
3. insert 1 row in table (within script)
4. reexecute (the same) select statement (times out!)

What is the expected output?
both selects should show the same resultset (8 rows) with same/comparable
execution times

What do you see instead?
the 2. Select times out after the defined time limit (10000ms)
Error: Statement was canceled or the session timed out; SQL statement:
..
SQLState:  90051
ErrorCode: 90051

What version of the product are you using? h2-1.1.117.jar
On what operating system  XP
file system NTFS  
and virtual machine?  Sun 1.6.0_16-b01

Do you know a workaround? No 

How important/urgent is the problem for you?
Prevents migrating to H2

In your view, is this a defect or a feature request?
defect

Please provide any additional information below.
script is attached

Regards charly

Original issue reported on code.google.com by gartenzw...@googlemail.com on 21 Aug 2009 at 9:58

Attachments:

issue.sql

GoogleCodeExporter commented 9 years ago

This is interesting. The query befoer the insert gives this plan:

SELECT PE.DISP, P1.SID, P1.LANG_DE AS TYPE, P2.LANG_DE AS RIGHT
FROM PUBLIC.KONT KO /* PUBLIC.KONT_TABLE_SCAN */ /* WHERE KO.KID = '_4038' */
INNER JOIN PUBLIC.PERS PE /* PUBLIC.PERS_TABLE_SCAN */ ON 1=1 /* WHERE 
KO.PERS_ID =
PE.PE_ID */
INNER JOIN PUBLIC.LANG P1 /* PUBLIC.LANG_TABLE_SCAN */ ON 1=1 /* WHERE 
KO.TYPE_ID =
P1.LID */
INNER JOIN PUBLIC.LANG P2 /* PUBLIC.LANG_TABLE_SCAN */ ON 1=1
WHERE (KO.RIGHT_ID = P2.LID) AND ((KO.TYPE_ID = P1.LID) AND ((KO.KID = '_4038') 
AND
(KO.PERS_ID = PE.PE_ID)))
ORDER BY 2, 1

And after the insert gives this plan:

SELECT PE.DISP, P1.SID, P1.LANG_DE AS TYPE, P2.LANG_DE AS RIGHT
FROM PUBLIC.PERS PE /* PUBLIC.PERS_TABLE_SCAN */
INNER JOIN PUBLIC.LANG P1 /* PUBLIC.LANG_TABLE_SCAN */ ON 1=1
INNER JOIN PUBLIC.LANG P2 /* PUBLIC.LANG_TABLE_SCAN */ ON 1=1
INNER JOIN PUBLIC.KONT KO /* PUBLIC.KONT_TABLE_SCAN */ ON 1=1
WHERE (KO.RIGHT_ID = P2.LID) AND ((KO.TYPE_ID = P1.LID) AND ((KO.KID = '_4038') 
AND
(KO.PERS_ID = PE.PE_ID)))
ORDER BY 2, 1

If you create an index on *any* column on *any* table, it will always generate 
the
first plan and run real fast. Drop the index, slow. Add the index, fast.

The fact that all the tables are lacking any indexes or primary key info 
probably has
the optimizer going down some pessimal corner case.

It's a bug, but it seems pretty minor; personally I can't get *too* worked up 
over
queries that are problematic on tables with no indexing information.

Chris

Original comment by cscha...@gmail.com on 22 Aug 2009 at 3:32

GoogleCodeExporter commented 9 years ago

To make sure that no false impression arises: This is not a simple thing like: 
query
with indexing is fast, query without indexing is slow:

1. before inserting this single row (i.e. without bug) the select runs also 
fast (on
my 6 year old PC 40ms) without any indexing
2. with this single row inserted (i.e. with bug effective) the select is not 
only
slow: it never gets ready in practical time.
   (I waited 1 hour). That means it is unusable in this use case.

I am using H2 as a test/development database. In this use case its not 
necessary that
the database performs with maximal achievable performance.
Because of this and because the DB is not that big I had not to deal with 
indexing.

Maybe this is a "pretty minor" bug for DB architects, who do not execute any 
select
without creating an appropriate index first.
But there are surely other users (like me), who simply want to use H2 without 
dealing
with indexing first (of course with not too big databases).
And H2 is capable to operate fast, even without indexing (see 1. above)

Ok, now that I know how to workaround this I can live with it, regardless it is 
not a
"pretty minor" bug for me.

Charly

Original comment by gartenzw...@googlemail.com on 23 Aug 2009 at 11:30

GoogleCodeExporter commented 9 years ago

Not a bug. All databases work like this.

Original comment by thomas.t...@gmail.com on 25 Aug 2009 at 10:12

Changed state: Invalid

GoogleCodeExporter commented 9 years ago

> Not a bug. All databases work like this.

Lets have a look at some other databases, how they perform in this case.
The same tables and data from the attached script:

Executing 2. Select from script (minor modificated):
SELECT pe.DISP,p1.SID, p1.LANG_DE as type, p2.LANG_DE as rightx from KONT ko, 
PERS
pe, LANG p1, LANG p2 where ko.KID='_4038' and 
ko.PERS_ID=pe.PE_ID and ko.TYPE_ID=p1.LID and ko.RIGHT_ID=p2.LID order by 
p1.SID, pe.DISP

Table  KONT has 5936 rows, LANG 164, PERS 4797

Execution times circa: in milliseconds
MS Access:                 31 -  63
HSQLDB 1.9.0-rc4:         140 - 234
JavaDB/Derby  10.4.2.0:    31 -  63
H2  1.1.117:              timeout exception, i.e. >10000

For me it does not look like "All databases work like this".
For me it looks like "Other databases are working in this case, H2 fails"

If this is "not a bug", i.e. performance without indexing is not a concern then 
you
should note that in the documentation.
Maybe with something like this:
"H2 is optimized to perform best with indexing. Without indexing use other 
databases,
which work in cases, where H2 fails"

Original comment by gartenzw...@googlemail.com on 27 Aug 2009 at 8:29

GoogleCodeExporter commented 9 years ago

It's good to know that other databases perform better in this case.
I can reproduce this problem up to some point, but by far not as extreme as 
with your
data. Could you provide a complete test case please? My test case is (the @LOOP
syntax only works in the H2 Console):

drop table test;
create table test(id int, name varchar(255));
@LOOP 100000 insert into test values(?, 'Hello World');
select * from test t1, test t2, test t3, test t4 where t1.id = 10 and t2.id = 
t1.id
and t3.id = t2.id and t4.id = t3.id;

> H2: 3833 ms
> Derby: 271 ms

Original comment by thomas.t...@gmail.com on 31 Aug 2009 at 5:21

Changed state: Accepted

GoogleCodeExporter commented 9 years ago

Could you not simply use the originally attached script "issue.sql"? This 
creates the
tables with data ("my data"), exactly that data, that I used to make the 
comparison
and that fails with H2. It also contains the select statement.

If this is a problem that this script has 1.1MB, then I will try to make 
something
simpler like yours.

Original comment by gartenzw...@googlemail.com on 31 Aug 2009 at 7:14

GoogleCodeExporter commented 9 years ago

I have made now a test case, which will show the timeout and is not so big:

SET QUERY_TIMEOUT 20000;

DROP TABLE KONT;
DROP TABLE LANG;
DROP TABLE PERS;

CREATE TABLE KONT (KID varchar(255),PERS_ID varchar(255),TYPE_ID
varchar(255),RIGHT_ID varchar(255));

@LOOP 10 insert into KONT (KID,PERS_ID,TYPE_ID,RIGHT_ID) values
('_4038',trim(char(52275+?)),'KO_2','KO_1');
@LOOP 5926 insert into KONT (KID,PERS_ID,TYPE_ID,RIGHT_ID) values
('_'||trim(char(?+1)),trim(char(50001+?)),'KO_2','KO_1');

CREATE TABLE LANG (LID varchar(255) NOT NULL,SID double,LANG_DE varchar(255));

@LOOP 164 insert into LANG (LID,SID,LANG_DE) values 
('KO_'||trim(char(?+1)),?,'--' );

CREATE TABLE PERS (PE_ID varchar(255) NOT NULL,DISP varchar(255));

@LOOP 4797 insert into PERS (PE_ID,DISP) values (trim(char(50001+?)),'-') ;

SELECT pe.DISP,p1.SID, p1.LANG_DE as type, p2.LANG_DE as rightx from KONT ko, 
PERS
pe, LANG p1, LANG p2 where ko.KID='_4038' and 
ko.PERS_ID=pe.PE_ID and ko.TYPE_ID=p1.LID and ko.RIGHT_ID=p2.LID order by 
p1.SID, pe.DISP

I hope this is sufficient for you.
The 'trim(char(' conversion is only needed for derby compatibility. H2 does not 
need it.
You can also omit "SET QUERY_TIMEOUT 20000;", but then you will have to wait 
long ;)

Original comment by gartenzw...@googlemail.com on 3 Sep 2009 at 7:06

GoogleCodeExporter commented 9 years ago

I found a way to improve performance for such queries. My test case is:

drop table test;
create table test(a int, b int);
@LOOP 20000 insert into test values(?, ?);
select * from test t2, test t1 where t1.a=1 and t1.b = t2.b;
select * from test t1, test t2 where t1.a=1 and t1.b = t2.b;

The fact that your example is fast using HSQLDB is just coincidence it seems: 
My test
case is quite slow in HSQLDB.

Anyway, this optimization will be implemented in the next release of H2.
Still, I suggest to create the right indexes.

Original comment by thomas.t...@gmail.com on 4 Sep 2009 at 1:53

GoogleCodeExporter commented 9 years ago

it seems you have found a very simple test case, which shows the bugs.

> Still, I suggest to create the right indexes.
Yes of course .. to maximize performance (by avoiding full table scans) this is
advisable.
I opened this issue because H2 should be also usable (compared with other 
databases)
without indexing. With this bug solved, this should be fulfilled.

Thank you for solving.

Original comment by gartenzw...@googlemail.com on 14 Sep 2009 at 7:03

GoogleCodeExporter commented 9 years ago

This should be fixed in version 1.1.119

Original comment by thomas.t...@gmail.com on 26 Sep 2009 at 11:33

Changed state: Fixed

lbehnke / h2database

Select executes very long or infinite or until timeout #113