jOOQ / jOOQ

jOOQ is the best way to write SQL in Java
https://www.jooq.org
Other
6.16k stars 1.21k forks source link

Transform Teradata style SELECT alias references #11820

Open lukaseder opened 3 years ago

lukaseder commented 3 years ago

Teradata bends the rules of the logical order of SQL operations quite a bit, see: https://blog.jooq.org/a-beginners-guide-to-the-true-order-of-sql-operations/

In Teradata, it seems that these are possible:

Referencing column alias from WHERE:

SELECT func(x, y) AS f FROM t WHERE f > 0

This would need to be transformed to either:

-- Substituting expressions
SELECT func(x, y) AS f FROM t WHERE func(x, y) > 0

-- Fixing the order of operations with derived tables
SELECT f FROM (SELECT func(x, y) AS f FROM t) AS t WHERE f > 0

Both approaches can lead to funny edge cases.

Note that it is also possible to do this kind of thing in PostgreSQL and possibly others, when referencing columns from GROUP BY or HAVING by index or name.

Referencing column alias from SELECT:

SELECT func(x, y) AS f, f + 1 AS g FROM t

It seems that a left-to-right name resolution is happening, allowing for aliases to be used on the same level of their declaration. Why not? Again, this can be transformed to either:

-- Substituting expressions
SELECT func(x, y) AS f, func(x, y) + 1 AS g FROM t

-- Fixing the order of operations with derived tables
SELECT f, f + 1 FROM (SELECT func(x, y) AS f FROM t) AS t

Reverse engineered rules:

The interesting bit is that there seems to be some sort of cycle detection when referencing aliases in SELECT:

-- This works
select b as c, a as b, 1 as a

-- This doesn't:
select b as c, a as b, c as a

The last query fails with:

Cannot resolve column 'c'. Specify table or view.

Subqueries:

select 1 as a, a + a, (select (select (select a + a as b)))
where (select (select a + a)) = 2

This works and produces:

|a  |(a+a)|b  |
|---|-----|---|
|1  |2    |2  |

However, b is not in scope for further references, e.g. this doesn't work:

select 1 as a, a + a, (select (select (select a + a as b))), b
where (select (select a + a)) = 2;

Producing:

Cannot resolve column 'b'. Specify table or view.

Despite the projected name being b!

Conflict resolution:

Ambiguous top level projected columns

select 1 as a, a + a, 2 as a

This works, and produces

|a  |(a+a)|a  |
|---|-----|---|
|1  |4    |2  |

So, in such an ambiguous case, only the last definition of a is being considered

Re-declaration of existing column

select 1 as a, a + a
from (select 2 as a) as t

This works as in all other dialects, and produces the standard result:

|a  |(a+a)|
|---|-----|
|1  |4    |

So, while the projected alias is available to ORDER BY like everywhere else, it does not affect same level projection references (or references from WHERE, etc.) if there's already a column by the same name in scope.

Implementation tasks

Dialect support:

lukaseder commented 3 years ago

The resolution of these "lateral column references" requires much more thought:

lukaseder commented 2 years ago

A much more simple, related feature has been implemented:

lukaseder commented 1 year ago

This seems to be a popular request for users who translate SQL away from Teradata et al. I'll reconsider this for jOOQ 3.18. With the current infrastructure, it seems not too hard to implement.

lukaseder commented 1 year ago

DuckDB seems to have implemented something similar.