Make sure you specify the "read" dialect eg. parse_one(sql, read="spark"): Done
Make sure you specify the "write" dialect eg. ast.sql(dialect="duckdb"): Done
Check if the issue still exists on main: still exist
Fully reproducible code snippet
Prepare data:
from sqlglot import parse_one, exp
from sqlglot.optimizer.qualify_columns import qualify_columns
sql ="""with data as (
select 1 as id, 2 as my_id, 'a' as name, 'de' as full_name
)
select
id as my_id,
concat(id, name) as full_name
from data
group by my_id, full_name"""
schema = {
"catalog": {
"schema1": {
"customer": {
"id": "INTEGER",
},
}
}
}
WITH data AS (
SELECT
1 AS id,
2 AS my_id,
'a' AS name,
'de' AS full_name
)
SELECT
data.id AS my_id,
CONCAT(CAST(data.id AS VARCHAR), CAST(data.name AS VARCHAR)) AS full_name
FROM data
GROUP BY
data.my_id,
data.full_name
The problem of the output is it can not be run on trino, because data.id and data.name is not an aggregate expression or appear in GROUP BY clause.
Additional test:
The output will expand correctly if we no not provide schema:
smt = parse_one(sql, read="bigquery")
print(qualify_columns(smt, schema=None).sql(dialect="trino", pretty=True))
"""
WITH data AS (
SELECT
1 AS id,
2 AS my_id,
'a' AS name,
'de' AS full_name
)
SELECT
data.id AS my_id,
CONCAT(CAST(data.id AS VARCHAR), CAST(data.name AS VARCHAR)) AS full_name
FROM data
GROUP BY
data.id,
CONCAT(CAST(data.id AS VARCHAR), CAST(data.name AS VARCHAR))
"""
The output will break again if I disable expand_alias_ref:
Before you file an issue
parse_one(sql, read="spark")
: Doneast.sql(dialect="duckdb")
: DoneFully reproducible code snippet
Prepare data:
Transpiler from bigquery to trino:
Output:
The problem of the output is it can not be run on trino, because
data.id
anddata.name
is not an aggregate expression or appear in GROUP BY clause.Additional test: The output will expand correctly if we no not provide schema:
The output will break again if I disable expand_alias_ref:
Solution
I have made a patch to fix my case, but I think we can have better solution to fix this.
https://github.com/viplazylmht/sqlglot/blob/644bf67720442f964ffc4606693dcfc1b87ec6e6/sqlglot/optimizer/qualify_columns.py#L423-L431
Official Documentation Please include links to official SQL documentation related to your issue.