Open krahnikblis opened 2 months ago
This feature is known as multiple insert clauses (also known as Multi Table Insert). It's not standard ANSI SQL syntax but a Hive extension. See https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-Syntax.1
SparkSQL also supports this syntax according to its antlr g4 file, although it's not documented: https://github.com/apache/spark/blob/v3.5.1/sql/api/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4#L444
We need sqlfluff to support parsing this before we can add lineage analysis support. Upstream issue raised: https://github.com/sqlfluff/sqlfluff/issues/5866
Describe the bug
When processing SQL scripts where the grammar is backwards (FROM comes before SELECT), LineageRunner().target_tables fails to parse/fix.
SQL
Paste the SQL text here. For example:
To Reproduce
Note: here we refer to the SQL provided in the prior step as stored in a file named test.sql
Expected behavior
Processing the SQL as normal, or perhaps (better) putting the FROM segment after SELECT [where it belongs; I know, I know, I didn't write the SQL, I'm just trying to organize and make sense of other people's stuff].
Python version (available via python --version):
SQLLineage version (available via sqllineage --version):
Additional context
It looks like this library sub-packages sqlfluff within its parser? I didn't know where to log the issue, but decided on here since the errors above show \site-packages\sqllineage\ for all traces. Separately, I do have sqlfluff 2.3.5 installed; I don't know if this means I have two versions or what...
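Until the parser supports this syntax, a possible stopgap is to rewrite FROM-first statements into ordinary INSERT ... SELECT ... FROM statements before passing them to the lineage tool. The sketch below is only an illustration of that idea: split_multi_insert is a hypothetical helper, not part of sqllineage or sqlfluff, and it relies on naive regex matching, so it assumes simple statements with no subqueries, CTEs, comments, or string literals that contain SQL keywords.

```python
import re
from typing import List

# Keywords that must come after FROM in a standard SELECT (naive, top-level only).
_TAIL = re.compile(
    r"\b(WHERE|GROUP\s+BY|HAVING|ORDER\s+BY|CLUSTER\s+BY|"
    r"DISTRIBUTE\s+BY|SORT\s+BY|LIMIT)\b",
    re.IGNORECASE,
)
_INSERT = re.compile(r"\bINSERT\s+(?:INTO|OVERWRITE)\b", re.IGNORECASE)
_SELECT = re.compile(r"\bSELECT\b", re.IGNORECASE)


def split_multi_insert(sql: str) -> List[str]:
    """Hypothetical helper: turn 'FROM src INSERT ... SELECT ...' into
    standalone 'INSERT ... SELECT ... FROM src' statements."""
    # Collapse whitespace and drop a trailing semicolon.
    text = " ".join(sql.strip().rstrip(";").split())
    if not re.match(r"FROM\b", text, re.IGNORECASE):
        return [text]  # not FROM-first, leave untouched

    starts = [m.start() for m in _INSERT.finditer(text)]
    if not starts:
        return [text]  # FROM-first but no INSERT clause, leave untouched

    from_clause = text[: starts[0]].strip()
    statements = []
    for begin, end in zip(starts, starts[1:] + [len(text)]):
        clause = text[begin:end].strip()
        select = _SELECT.search(clause)
        if select is None:
            statements.append(clause)  # unexpected shape, keep as-is
            continue
        tail = _TAIL.search(clause, select.end())
        if tail:
            # Put the shared FROM clause between the select list and WHERE etc.
            head = clause[: tail.start()].rstrip()
            statements.append(f"{head} {from_clause} {clause[tail.start():]}")
        else:
            statements.append(f"{clause} {from_clause}")
    return statements


example = """
FROM src
INSERT OVERWRITE TABLE t1 SELECT a WHERE a > 0
INSERT INTO TABLE t2 SELECT b
"""
for statement in split_multi_insert(example):
    print(statement + ";")
```

Each rewritten statement can then be analyzed on its own. Anything beyond the simple shape shown here would need a real parser rather than regular expressions.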