PRQL / prql

PRQL is a modern language for transforming data — a simple, powerful, pipelined SQL replacement
https://prql-lang.org
Apache License 2.0
9.79k stars 212 forks source link

ClickHouse prefers aliases to column names #2827

Open eitsupi opened 1 year ago

eitsupi commented 1 year ago

From https://github.com/PRQL/prql/pull/2826#issuecomment-1588871068 and ClickHouse/ClickHouse#50921

In the following SQL, a DB other than ClickHouse (which are used in the tests in this repository) will interpret x in WHERE x == 1 as column x created by the CTE. But ClickHouse interpret x is -x AS x.

WITH _tab AS (
  select 1 AS x
)

SELECT -x AS x
FROM _tab
WHERE x == 1

The behavior can be changed by adding a SETTING clause.

WITH _tab AS (
  select 1 AS x
)

SELECT -x AS x
FROM _tab
WHERE x == 1
SETTINGS prefer_column_name_to_alias = 1

...or exec SET prefer_column_name_to_alias = 1; before querying.

I am wondering if prql-compiler can address this issue. For example, this problem occurs when compiling the following query:

https://github.com/PRQL/prql/blob/3c2860b7756f030841ce74c9960820b61920b919/prql-compiler/tests/integration/queries/genre_counts.prql#L2-L9

max-sixty commented 1 year ago

I think we could consider just adding that to every clickhouse query. It's less elegant than changing the SQL generation to avoid the problem — it means it's on every query.

But until it's more widely used, I would prioritize fixing bugs before adding these sorts of features by dialect (to the extent there's any tradeoff between those — if someone wants to do this then they're ofc welcome...)