dimagi / django-cte

Common Table Expressions (CTE) for Django

Allow non-recursive CTEs to avoid optimization fence #31

Open henribru opened 3 years ago

henribru commented 3 years ago

In Postgres 12 the default behavior of CTEs changed. They're no longer materialized if they are non-recursive, side-effect free and only appear once in the query: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=608b167f9f9c4553c35bb1ec0eab9ddae643989b

django-cte makes all CTEs recursive: https://github.com/dimagi/django-cte/blob/master/django_cte/query.py#L85

It would be nice to have an option for non-recursive CTEs so that they do not act as optimization fences.

millerdev commented 2 years ago

After reviewing the documentation, I am unable to determine whether the RECURSIVE keyword has any effect on the optimizations introduced in Postgres 12.

if a WITH query is non-recursive and side-effect-free (that is, it is a SELECT containing no volatile functions) then it can be folded into the parent query, allowing joint optimization of the two query levels. By default, this happens if the parent query references the WITH query just once, but not if it references the WITH query more than once. You can override that decision by specifying MATERIALIZED to force separate calculation of the WITH query, or by specifying NOT MATERIALIZED to force it to be merged into the parent query.
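To make the override syntax in the quoted passage concrete, here is a small sketch that renders a single CTE definition with or without a materialization keyword. The helper name `render_cte` and its parameters are hypothetical, invented for illustration; they are not part of django-cte or of any database driver.

```python
from typing import Optional


def render_cte(name: str, query: str, materialized: Optional[bool] = None) -> str:
    """Render one CTE definition for use after WITH (hypothetical helper).

    materialized=None  -> no keyword, the planner decides
    materialized=True  -> AS MATERIALIZED (force separate calculation)
    materialized=False -> AS NOT MATERIALIZED (force merging into the parent)
    """
    if materialized is None:
        keyword = ""
    elif materialized:
        keyword = "MATERIALIZED "
    else:
        keyword = "NOT MATERIALIZED "
    return f"{name} AS {keyword}({query})"


# Build a complete statement that asks the planner to inline the CTE.
sql = "WITH " + render_cte("w", "SELECT 1 AS x", materialized=False) + " SELECT x FROM w"
print(sql)
```

The keyword attaches to one CTE at a time, so each definition in a comma-separated WITH list can carry its own choice.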

The RECURSIVE keyword is only allowed once, immediately after the WITH keyword, while multiple comma-delimited CTEs may follow, some of which may not be recursive. It is unclear whether the presence of the RECURSIVE keyword causes the query planner to mark every CTE in the WITH block as recursive, or whether a deeper analysis (something like "does the CTE contain a self-referential UNION query?") is done on each CTE in the WITH block to determine whether it is recursive.

WITH RECURSIVE s(m) AS (
  SELECT 0  -- is this considered recursive?
), t(n) AS (
  SELECT 1
  UNION ALL
  SELECT n+1 FROM t  -- this certainly is recursive
)
SELECT m, n FROM s, t LIMIT 10;

It seems reasonable to interpret the phrase "WITH query is non-recursive" as applying to a single CTE within a WITH RECURSIVE block: later in that same paragraph, the documentation says the optimization decision can be overridden "by specifying MATERIALIZED to force separate calculation of the WITH query", which clearly applies to a single query within a block that may contain more than one.
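One way to check this empirically is to look at the plan Postgres produces for the example query. The sketch below assumes a DB-API connection to a Postgres 12+ server (for example one created with psycopg, which is deliberately not imported here) and relies on the fact that a CTE computed separately appears in EXPLAIN output as a `CTE Scan on <name>` node. The function names `explain_lines` and `has_cte_scan` are hypothetical.

```python
import re
from typing import Iterable, List

# The example query from this comment, unchanged apart from the trailing semicolon.
QUERY = """
WITH RECURSIVE s(m) AS (
  SELECT 0
), t(n) AS (
  SELECT 1
  UNION ALL
  SELECT n+1 FROM t
)
SELECT m, n FROM s, t LIMIT 10
"""


def explain_lines(conn, sql: str) -> List[str]:
    """Run EXPLAIN on sql through a DB-API connection and return the plan lines.

    Assumes conn points at a Postgres server; nothing here is driver-specific.
    """
    cur = conn.cursor()
    try:
        cur.execute("EXPLAIN " + sql)
        return [row[0] for row in cur.fetchall()]
    finally:
        cur.close()


def has_cte_scan(plan_lines: Iterable[str], cte_name: str) -> bool:
    """Return True if the plan text contains a 'CTE Scan on <cte_name>' node."""
    pattern = re.compile(r"CTE Scan on " + re.escape(cte_name) + r"\b")
    return any(pattern.search(line) for line in plan_lines)
```

With a live connection, `has_cte_scan(explain_lines(conn, QUERY), "s")` returning `False` would suggest that `s` was folded into the parent query despite the RECURSIVE keyword, while `t` should still show a CTE Scan because it really is recursive.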

The SQLite documentation is more explicit, although in SQLite MATERIALIZED and NOT MATERIALIZED are only hints that the query planner may ignore. Unlike Postgres, SQLite does not require the RECURSIVE keyword, even for recursive CTEs (rCTEs).

The SQL:1999 spec requires that the RECURSIVE keyword follow WITH in any WITH clause that includes a recursive common table expression. However, for compatibility with SqlServer and Oracle, SQLite does not enforce this rule.
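The SQLite behavior described in that quote can be tried with Python's standard `sqlite3` module. This is a minimal sketch using an in-memory database; the CTE name `t` and the bound of 5 are arbitrary choices for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A recursive CTE written without the RECURSIVE keyword. The WHERE clause
# bounds the recursion so the query terminates on its own.
rows = conn.execute(
    """
    WITH t(n) AS (
      SELECT 1
      UNION ALL
      SELECT n+1 FROM t WHERE n < 5
    )
    SELECT n FROM t
    """
).fetchall()

numbers = [n for (n,) in rows]
print(numbers)
conn.close()
```

SQLite accepts the statement and yields the integers 1 through 5, whereas Postgres would reject the same text because `t` refers to itself without WITH RECURSIVE.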