LucidDB / luciddb

DEFUNCT: See README
https://github.com/LucidDB/luciddb
Apache License 2.0
52 stars 24 forks source link

[FRG-120] null semantics for IN are broken #750

Open dynamobi-build opened 12 years ago

dynamobi-build commented 12 years ago

[reporter="jvs", created="Mon, 1 May 2006 18:19:28 -0500 (GMT-05:00)"] As part of implementing FRG-108, I punted on null semantics. This will cause trouble for NOT IN and when the IN clause is used in a context where the null semantics are observable (e.g. WHERE (x in (1,2)) is unknown).

dynamobi-build commented 12 years ago

[author="jvs", created="Sun, 22 Oct 2006 22:13:53 -0500 (GMT-05:00)"] You've already taken care of this one, except I need to test with some of the other null-sensitive predicates besides NOT and OR.

dynamobi-build commented 12 years ago

[author="jvs", created="Sun, 22 Oct 2006 22:57:36 -0500 (GMT-05:00)"] OK, below are some IS UNKNOWN examples that cause trouble. Note that the same goes for IS [NOT] NULL, because that is equivalent to IS [NOT] UNKNOWN when applied to a boolean. Also, the IS [NOT] DISTINCT FROM predicate has the same issue, because you can rewrite (predicate IS NOT UNKNOWN) as (predicate IS DISTINCT FROM cast(null as boolean)), etc.

-- setup (using LucidDB)
create schema s;
create table s.t1(i int, j char(1));
insert into s.t1 values (1, 'A'), (2, 'B'), (4, 'C'), (null, 'D');
create table s.t2(m int, n char(1));
insert into s.t2 values (1, 'X'), (2, 'Y'), (3, 'Z');

-- should return D (currently returns empty set)
select j from s.t1
where (i in (select m from s.t2)) is unknown;

-- should return A,B,C (currently returns A,B)
select j from s.t1
where (i in (select m from s.t2)) is not unknown;

-- should return Z (currently returns empty set)
select n from s.t2
where (m in (select i from s.t1)) is unknown;

dynamobi-build commented 12 years ago

[author="zfong", created="Fri, 28 Mar 2008 13:27:47 -0500 (GMT-05:00)"] A couple of observations:

1) The query plans for the 2 queries with the "is unknown" predicate have joins in them, but a join between an empty FennelValuesRel and the table referenced in the subquery. I.e., in the first query, t1 doesn't appear anywhere in the explain plan. That probably explains why these queries return no rows.

2) In order for the result of the 2nd query to include the row where i == 4, you would think that some type of left outer join would be required. Instead, the explain plan only shows a left, semijoin for the hash join.

dynamobi-build commented 12 years ago

[author="jvs", created="Fri, 28 Mar 2008 15:58:36 -0500 (GMT-05:00)"] Note checks for NOT and OR added by Rushan in SqlToRelConverter.findSubqueries. I think the additional 3VL-sensitive operators need to be handled here.