citusdata / citus

Distributed PostgreSQL as an extension
https://www.citusdata.com
GNU Affero General Public License v3.0
10.63k stars 670 forks source link

PG17 compatibility: Fix Test Failure in local_dist_join_mixed #7731

Open m3hm3t opened 1 week ago

m3hm3t commented 1 week ago

PostgreSQL 16 adds an extra condition (id IS NOT NULL) to the subquery. This condition is likely used to ensure that no null values are processed in the subquery. Instead of using the condition id IS NOT NULL, PostgreSQL 17 generates the subplan with a trivial condition (WHERE true), indicating that it does not need to explicitly check for non-null values.

PostgreSQL 17 likely includes optimizations to handle null checks more efficiently. The WHERE (id IS NOT NULL) condition that was present in PostgreSQL 16 may now be considered redundant by the planner, as it is implicitly handled by the query execution engine.

https://github.com/postgres/postgres/commit/b262ad44

 SELECT
        foo1.id
    FROM
 (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo9,
 (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo8,
 (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo7,
 (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo6,
 (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo5,
 (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo4,
 (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo3,
 (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo2,
 (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo10,
 (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo1
 WHERE
  foo1.id =  foo9.id AND
  foo1.id =  foo8.id AND
  foo1.id =  foo7.id AND
  foo1.id =  foo6.id AND
  foo1.id =  foo5.id AND
  foo1.id =  foo4.id AND
  foo1.id =  foo3.id AND
  foo1.id =  foo2.id AND
  foo1.id =  foo10.id AND
  foo1.id =  foo1.id
ORDER BY 1;
...
-DEBUG:  generating subplan XXX_10 for subquery SELECT id FROM local_dist_join_mixed.local WHERE (id IS NOT NULL)
+DEBUG:  generating subplan XXX_10 for subquery SELECT id FROM local_dist_join_mixed.local WHERE true
...
codecov[bot] commented 1 week ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 89.60%. Comparing base (b29c332) to head (0a8e7ab).

Additional details and impacted files ```diff @@ Coverage Diff @@ ## naisila/pg17_support #7731 +/- ## ======================================================== - Coverage 89.61% 89.60% -0.01% ======================================================== Files 274 274 Lines 59689 59689 Branches 7446 7446 ======================================================== - Hits 53490 53486 -4 - Misses 4069 4072 +3 - Partials 2130 2131 +1 ```
naisila commented 2 days ago

Here we can simply remove the redundant filter because it wasn't the original intention of this test.

 SELECT
        foo1.id
    FROM
 (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo9,
 (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo8,
 (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo7,
 (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo6,
 (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo5,
 (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo4,
 (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo3,
 (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo2,
 (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo10,
 (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo1
 WHERE
  foo1.id =  foo9.id AND
  foo1.id =  foo8.id AND
  foo1.id =  foo7.id AND
  foo1.id =  foo6.id AND
  foo1.id =  foo5.id AND
  foo1.id =  foo4.id AND
  foo1.id =  foo3.id AND
  foo1.id =  foo2.id AND
  foo1.id =  foo10.id AND
-  foo1.id =  foo1.id
ORDER BY 1;