Fix Test Failure in subquery_in_where, set_operations in PG17 #7741
The test failures are caused by this commit in PG17, which enables correlated subqueries to be pulled up to a join. Prior to this, the correlated subquery was implemented as a subplan. In citus, it is not possible to pushdown a correlated subplan, but with a different plan in PG17 the query can be executed, per the test diff from subquery_in_where:
37,39c37,41
< DEBUG: generating subplan XXX_1 for CTE event_id: SELECT user_id AS events_user_id, "time" AS events_time, event_type FROM public.events_table
< DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT count(*) AS count FROM (SELECT intermediate_result.events_user_id, intermediate_result.events_time, intermediate_result.event_type FROM read_intermediate_result('XXX_1'::text, 'binary'::citus_copy_format) intermediate_result(events_user_id integer, events_time timestamp without time zone, event_type integer)) event_id WHERE (events_user_id OPERATOR(pg_catalog.=) ANY (SELECT users_table.user_id FROM public.users_table WHERE (users_table."time" OPERATOR(pg_catalog.=) event_id.events_time)))
< ERROR: correlated subqueries are not supported when the FROM clause contains a CTE or subquery
---
> count
> ---------------------------------------------------------------------
> 0
> (1 row)
>
This is because with pg17 = ANY subquery in the queries can be implemented as a join, instead of as a subplan filter on a table scan. For example, SELECT * FROM test a WHERE x IN (SELECT x FROM test b UNION SELECT y FROM test c WHERE a.x = c.x) ORDER BY 1,2 (from set_operations) has this plan in pg17; note that the subquery is the inner side of a nested loop join:
┌───────────────────────────────────────────────────┐
│ QUERY PLAN │
├───────────────────────────────────────────────────┤
│ Sort │
│ Sort Key: a.x, a.y │
│ -> Nested Loop │
│ -> Seq Scan on test a │
│ -> Subquery Scan on "ANY_subquery" │
│ Filter: (a.x = "ANY_subquery".x) │
│ -> HashAggregate │
│ Group Key: b.x │
│ -> Append │
│ -> Seq Scan on test b │
│ -> Seq Scan on test c │
│ Filter: (a.x = x) │
└───────────────────────────────────────────────────┘
and this plan in pg16 (and previous pg versions); the subquery is a correlated subplan filter on a table scan:
┌───────────────────────────────────────────────┐
│ QUERY PLAN │
├───────────────────────────────────────────────┤
│ Sort │
│ Sort Key: a.x, a.y │
│ -> Seq Scan on test a │
│ Filter: (SubPlan 1) │
│ SubPlan 1 │
│ -> HashAggregate │
│ Group Key: b.x │
│ -> Append │
│ -> Seq Scan on test b │
│ -> Seq Scan on test c │
│ Filter: (a.x = x) │
└───────────────────────────────────────────────┘
The fix Modifies the queries causing the test failures so that an ANY subquery is not folded to a join, preserving the expected output of the tests. A similar approach was taken for existing regress tests in the postgres commit. See the joinregress test, for example.
Fix Test Failure in subquery_in_where, set_operations in PG17 #7741
The test failures are caused by this commit in PG17, which enables correlated subqueries to be pulled up to a join. Prior to this, the correlated subquery was implemented as a subplan. In citus, it is not possible to pushdown a correlated subplan, but with a different plan in PG17 the query can be executed, per the test diff from
subquery_in_where
:This is because with pg17
= ANY subquery
in the queries can be implemented as a join, instead of as a subplan filter on a table scan. For example,SELECT * FROM test a WHERE x IN (SELECT x FROM test b UNION SELECT y FROM test c WHERE a.x = c.x) ORDER BY 1,2
(from set_operations) has this plan in pg17; note that the subquery is the inner side of a nested loop join:and this plan in pg16 (and previous pg versions); the subquery is a correlated subplan filter on a table scan:
The fix Modifies the queries causing the test failures so that an ANY subquery is not folded to a join, preserving the expected output of the tests. A similar approach was taken for existing regress tests in the postgres commit. See the
join
regress test, for example.