Open radeusgd opened 1 year ago
Comparing query plans for the example query:
T1 twice in the 'naive' example and only once with CTE (db-fiddle).
Commit 33b2ff7105b6ce95be3ba87ca805304de354496f adds manual "let-binding"-style CTEs for simplifying generated SQL, and uses them to greatly reduce the size of the SQL generated for non-builtin round for backends that support nested WITH...AS queries. This is a good temporary measure and can be used freely whenever it improves query size, but it is not a sufficient solution.
Currently we combine queries by including the 'ingredients' of a query as subqueries. That works, but in some cases it causes duplication, making the queries larger than necessary and thus harder for humans to read and for databases to optimize.
Motivating example:
With the current design this will lead to a query like (simplified):
As we can see, the expression (which in some cases can be a much larger and more complicated query, possibly itself consisting of many subqueries)
SELECT *, Y+100 AS X FROM T0 WHERE Z == 20 AND W LIKE ? AS T1
is duplicated in the subqueries. This is undesirable. Of course, the database optimizer may detect that the subqueries are the same, but that is less likely than if they both referred to the same CTE. Moreover, the query itself is larger, which makes it more complex and harder for humans to read. In fact, in the worst case, if we do something like:
the query size will grow exponentially (as will the result size), whereas if we express the subqueries as CTEs, the query will only grow linearly.
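The growth claim can be checked with a small sketch. It is not the actual SQL generator: `BASE`, `naive_self_join` and `cte_self_join` are hypothetical names made up for this illustration, and the repeated self-join is only an assumed stand-in for the worst case described above. The sketch only measures the length of the generated text and does not execute it.

```python
# Hypothetical base query, modeled on the T1 expression quoted in this issue.
BASE = "SELECT *, Y + 100 AS X FROM T0 WHERE Z = 20 AND W LIKE ?"


def naive_self_join(depth: int) -> str:
    """Inline the previous step twice, as two subqueries, at every level."""
    query = BASE
    for _ in range(depth):
        query = f"SELECT * FROM ({query}) AS L CROSS JOIN ({query}) AS R"
    return query


def cte_self_join(depth: int) -> str:
    """Bind every step to a CTE once and refer to it by name afterwards."""
    ctes = [f"S0 AS ({BASE})"]
    for i in range(1, depth + 1):
        ctes.append(
            f"S{i} AS (SELECT * FROM S{i - 1} AS L CROSS JOIN S{i - 1} AS R)"
        )
    return "WITH " + ", ".join(ctes) + f" SELECT * FROM S{depth}"


# Query text length for 1 to 8 levels of self-joins.
naive_sizes = [len(naive_self_join(d)) for d in range(1, 9)]
cte_sizes = [len(cte_self_join(d)) for d in range(1, 9)]

for depth, (naive, cte) in enumerate(zip(naive_sizes, cte_sizes), start=1):
    print(f"depth={depth} naive={naive} cte={cte}")
```

In this sketch each naive step contains two copies of the previous step, so its length more than doubles per level, while each CTE step adds one fixed-size definition.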
A proposed solution would be to modify the SQL generator to prefer CTEs over subqueries and ensure that common subtrees are compiled to a single CTE. Then the first example from above could be compiled to something like:
Which is much more structured and easier to read, and includes the (possibly more complex) expression for T1 only once in the query.
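To make the comparison concrete, here is a minimal sketch using SQLite through Python's standard `sqlite3` module. The table contents, the join condition and the `AX`/`BX` aliases are assumptions made up for this example; only the names T0, T1, X, Y, Z and W come from the issue, and the filter is written as `Z = 20`. It runs the same self-join once with the T1 expression inlined twice and once with T1 defined as a single CTE, then checks that both return the same rows.

```python
import sqlite3

# Hypothetical sample data for T0.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T0 (Y INTEGER, Z INTEGER, W TEXT)")
conn.executemany(
    "INSERT INTO T0 VALUES (?, ?, ?)",
    [(1, 20, "abc"), (2, 20, "abd"), (3, 10, "abc"), (4, 20, "xyz")],
)

# The shared expression, corresponding to T1 in the issue.
T1_BODY = "SELECT *, Y + 100 AS X FROM T0 WHERE Z = 20 AND W LIKE ?"

# 'Naive' form: the T1 expression is inlined twice as subqueries.
naive_sql = f"""
SELECT A.X AS AX, B.X AS BX
FROM ({T1_BODY}) AS A
JOIN ({T1_BODY}) AS B ON A.Y = B.Y
ORDER BY AX
"""

# CTE form: the T1 expression appears once and is referenced by name.
cte_sql = f"""
WITH T1 AS ({T1_BODY})
SELECT A.X AS AX, B.X AS BX
FROM T1 AS A
JOIN T1 AS B ON A.Y = B.Y
ORDER BY AX
"""

# The naive form has two placeholders to bind, the CTE form only one.
naive_rows = conn.execute(naive_sql, ("ab%", "ab%")).fetchall()
cte_rows = conn.execute(cte_sql, ("ab%",)).fetchall()

print(naive_rows == cte_rows)
```

A side effect visible here is that the CTE form also needs the `LIKE` parameter bound only once, because the placeholder is no longer duplicated along with the subquery.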