Open skada-coder opened 5 months ago
Describe the bug When parsing a UNION clause using * and the MetadataProvider class it fails to resolve subsequent source columns in the UNION clause.
Below is a simplified example.
SQL Paste the SQL text here. For example:
CREATE TABLE MAIN.FOOBAR AS ( SELECT * FROM MAIN.BAR UNION SELECT * FROM MAIN.BAR2 UNION SELECT * FROM MAIN.BAR3 )
Column Lineage fails to resolve columns from MAIN.BAR2 and MAIN.BAR3 but MAIN.BAR is fine
To Reproduce
from sqllineage.core.metadata.dummy import DummyMetaDataProvider from sqllineage.runner import LineageRunner metadata = { "main.bar": ["col1", "col2"], "main.bar2": ["col1", "col2"], "main.bar3": ["col1", "col2"] } provider = DummyMetaDataProvider(metadata) sql1 = """ CREATE TABLE MAIN.FOOBAR AS ( SELECT * FROM MAIN.BAR UNION SELECT * FROM MAIN.BAR2 UNION SELECT * FROM MAIN.BAR3 ) """ LineageRunner(sql1, metadata_provider=provider).print_column_lineage()
actual output is
main.foobar.col1 <- main.bar.col1 main.foobar.col2 <- main.bar.col2
I'd expect it to return this instead
main.foobar.col1 <- main.bar.col1 main.foobar.col2 <- main.bar.col2 main.foobar.col1 <- main.bar2.col1 main.foobar.col2 <- main.bar2.col2 main.foobar.col1 <- main.bar3.col1 main.foobar.col2 <- main.bar3.col2
Python version (available via python --version)
python --version
SQLLineage version (available via sqllineage --version):
sqllineage --version
Note that this issue does not occur in the non case where the columns are all explicit in the UNION clause so looks like it only impacts scenario.
We noticed the issue using the snowflake dialect for our own production use cases
Describe the bug When parsing a UNION clause using * and the MetadataProvider class it fails to resolve subsequent source columns in the UNION clause.
Below is a simplified example.
SQL Paste the SQL text here. For example:
Column Lineage fails to resolve columns from MAIN.BAR2 and MAIN.BAR3 but MAIN.BAR is fine
To Reproduce
actual output is
I'd expect it to return this instead
Python version (available via
python --version
)SQLLineage version (available via
sqllineage --version
):Note that this issue does not occur in the non case where the columns are all explicit in the UNION clause so looks like it only impacts scenario.
We noticed the issue using the snowflake dialect for our own production use cases