apache / cloudberry

One advanced and mature open-source MPP (Massively Parallel Processing) database. Open source alternative to Greenplum Database.
https://cloudberry.apache.org
Apache License 2.0

WIP: ORCA: Fix eliminate self comparison #722

Open fanfuxiaoran opened 1 day ago

fanfuxiaoran commented 1 day ago

For the query below:

create table t1(a int, b int not null);
create table t2(like t1);
select t1.*, t2.* from t1 full join t2 on false where (t1.b < t1.b) is null;

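ORCA generates a wrong plan (it always returns an empty result, although every t2 row has t1.b NULL-extended by the full join and therefore satisfies the predicate):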

Result  (cost=0.00..0.00 rows=0 width=16)
    One-Time Filter: false
Optimizer: Pivotal Optimizer (GPORCA)
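
To see why a constant-false filter is incorrect here, the query's semantics can be simulated outside the database. The following is a minimal sketch in Python, not ORCA code: the helper names and the sample rows are made up for illustration, and `None` stands in for SQL NULL.

```python
from typing import Optional

def lt(x: Optional[int], y: Optional[int]) -> Optional[bool]:
    """SQL '<' under three-valued logic: NULL (None) if either operand is NULL."""
    if x is None or y is None:
        return None
    return x < y

def full_join_on_false(left, right, left_width, right_width):
    """FULL JOIN ... ON false: no rows match, so every row is NULL-extended."""
    rows = [l + (None,) * right_width for l in left]
    rows += [(None,) * left_width + r for r in right]
    return rows

# Hypothetical sample data: one row per table, columns (a, b).
t1 = [(1, 10)]
t2 = [(2, 20)]

joined = full_join_on_false(t1, t2, 2, 2)

# WHERE (t1.b < t1.b) IS NULL; t1.b is at index 1 of the joined row.
result = [row for row in joined if lt(row[1], row[1]) is None]
```

With this sample data, the row that originates from t2 survives the filter because its t1.b is NULL, so a plan that always returns zero rows cannot be correct.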

The root cause is that '(t1.b < t1.b)' is transformed into 'CScalarConst (0)' by 'PexprEliminateSelfComparison'. When FSelfComparison checks whether the self comparison can be simplified, it determines nullability (CColRef IsNullable) only from the column definition and does not check whether the column comes from an outer join, where it can be NULL even though it is declared NOT NULL.

To fix it, before simplifying the scalar expression, we first get 'pcrsNotNull' from its parent expression. 'pcrsNotNull' records which output columns are known to be non-nullable. If the column is not in 'pcrsNotNull', the self comparison cannot be transformed into a constant true or false.
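
The rule can be sketched as follows. This is an illustrative Python model, not the actual ORCA implementation: the function name `simplify_self_comparison`, the string operator encoding, and the set `not_null_cols` (standing in for 'pcrsNotNull') are all assumptions made for the example.

```python
from typing import Optional, Set

def simplify_self_comparison(op: str, col: str, not_null_cols: Set[str]) -> Optional[bool]:
    """Return the constant that 'col op col' folds to, or None if it must be kept.

    not_null_cols models the set of columns known to be non-NULL in the
    output of the parent expression (hypothetical stand-in for pcrsNotNull).
    """
    if col not in not_null_cols:
        # The column may be NULL here (e.g. NULL-extended by an outer join),
        # so the comparison may evaluate to NULL and cannot be folded.
        return None
    if op in ("=", "<=", ">="):
        return True
    if op in ("<", ">", "<>"):
        return False
    return None  # unknown operator: leave the expression unchanged

# Above a plain scan of t1, b is NOT NULL, so folding is safe.
folded_on_scan = simplify_self_comparison("<", "t1.b", {"t1.b"})

# Above the full join, t1.b is no longer guaranteed non-NULL.
folded_on_full_join = simplify_self_comparison("<", "t1.b", set())
```

In this model the comparison folds to false only when the column is in the not-null set; above the full join it is left in place so that the `IS NULL` test is evaluated per row.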

Fixes #594

fanfuxiaoran commented 1 day ago

Working on adding tests for it!

fanfuxiaoran commented 22 hours ago

Found that gpdb has a similar commit: https://github.com/greenplum-db/gpdb-archive/commit/d3dd98c1a8daf04fbf6cb91fc4afa6f91b317e93 I cherry-picked it, but it has some problems, so I added a commit to fix them.