databrickslabs / remorph

Cross-compiler and Data Reconciler into Databricks Lakehouse

[LCA]: Ability to replace Lateral Alias with actual column name and Expression #1222

Open sundarshankar89 opened 6 days ago

sundarshankar89 commented 6 days ago

Is there an existing issue for this?

Category of feature request

Transpile

Problem statement

Given the following Snowflake SQL:

SELECT column_a as customer_id
FROM mytable
WHERE customer_id = '123'

The following Spark SQL should be returned, because Spark SQL scopes column aliases differently:

SELECT column_a as customer_id
FROM mytable
WHERE column_a = '123'

https://www.databricks.com/blog/introducing-support-lateral-column-alias
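To make the requested rewrite concrete, here is a minimal sketch in plain Python of the transformation described above. It is not the code in lca_utils.py and not how remorph does it; the function name `inline_where_aliases` and the regex approach are assumptions for illustration only. It handles just a flat `SELECT ... FROM ... WHERE ...` statement whose SELECT items contain no commas inside expressions, and it does not check whether an alias shadows a real column of the table.

```python
import re

# Hypothetical helper patterns; a real transpiler would work on a parsed tree.
_STATEMENT = re.compile(
    r"^\s*SELECT\s+(?P<select>.+?)\s+FROM\s+(?P<source>.+?)"
    r"\s+WHERE\s+(?P<where>.+?)\s*;?\s*$",
    re.IGNORECASE | re.DOTALL,
)
_ALIASED_ITEM = re.compile(
    r"^\s*(?P<expr>.+?)\s+AS\s+(?P<alias>\w+)\s*$", re.IGNORECASE | re.DOTALL
)
_STRING_LITERAL = re.compile(r"('(?:[^']|'')*')")
# An identifier that is not qualified (not preceded by a dot or word character).
_IDENTIFIER = re.compile(r"(?<![\w.])[A-Za-z_]\w*")


def inline_where_aliases(sql: str) -> str:
    """Replace SELECT-list aliases used in WHERE with their source expressions."""
    statement = _STATEMENT.match(sql)
    if statement is None:
        return sql  # shape not supported by this sketch; leave untouched

    # Collect alias -> expression from items written as `<expr> AS <alias>`.
    aliases = {}
    for item in statement["select"].split(","):
        aliased = _ALIASED_ITEM.match(item)
        if aliased:
            aliases[aliased["alias"].lower()] = aliased["expr"]

    def substitute(identifier: re.Match) -> str:
        word = identifier.group(0)
        expr = aliases.get(word.lower())
        if expr is None:
            return word
        # Parenthesize anything that is not a bare column name.
        return expr if re.fullmatch(r"\w+", expr) else f"({expr})"

    # re.split with a capturing group puts string literals at odd indexes;
    # those are copied through so values such as '123' are never rewritten.
    pieces = _STRING_LITERAL.split(statement["where"])
    rewritten = "".join(
        piece if index % 2 == 1 else _IDENTIFIER.sub(substitute, piece)
        for index, piece in enumerate(pieces)
    )
    return sql[: statement.start("where")] + rewritten + sql[statement.end("where"):]


if __name__ == "__main__":
    print(inline_where_aliases(
        "SELECT column_a as customer_id\nFROM mytable\nWHERE customer_id = '123'"
    ))
```

Only the WHERE clause is rewritten; the rest of the statement, including the original alias in the SELECT list, is passed through unchanged. Aliases of non-trivial expressions are wrapped in parentheses so operator precedence in the WHERE clause is preserved.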

Proposed Solution

NA

Additional Context

No response

sundarshankar89 commented 6 days ago

@ericvergnaud this might be an issue you can tackle.

sundarshankar89 commented 6 days ago

More such examples are in lca_utils.py, which shows how we tackle this. We haven't covered all combinations, but you can refer to it to see how we do the preprocessing in the Python prototype.

ericvergnaud commented 6 days ago

@sundarshankar89 re the output, did you mean:

SELECT column_a as customer_id
FROM mytable
WHERE column_a = '123'

?

sundarshankar89 commented 6 days ago

Just edited the description.

Also check out test_lca_utils.py; it has a few more examples with LCA in the source system.