intuit / superglue

Superglue is a lineage-tracking tool built to help visualize the propagation of data through complex pipelines composed of tables, jobs and reports.
Apache License 2.0

Enable SQL dialect config for SQL parser #30

Closed lingyv-li closed 3 years ago

lingyv-li commented 3 years ago

Changes:

  1. Upgrade Calcite from 1.18 to 1.26 to get the dialect.configureParser method.
    1. Also upgrade elastic4s from 6.5.1 to 6.5.7 to fix a Jackson minor-version incompatibility.
  2. Add an optional string argument, dialect:
    1. to FileInputConfig and ScriptInput along the path.
    2. to the reporting consumer.
    3. to the StatementParser.parseStatement method.
  3. In CalciteStatementParser, read dialect as one of the DatabaseProduct values. If a dialect is given, use it to configure the parser; otherwise fall back to the previous config (MYSQL_5).
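The fallback rule in item 3 can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this PR: the `Product` enum is a hypothetical stand-in for Calcite's DatabaseProduct, `DialectResolver` and `parserConfigFor` are made-up names, the returned strings are only labels for the parser settings that would be chosen, and the upper-casing of the input is an assumption of the sketch.

```java
import java.util.Locale;
import java.util.Optional;

// Hypothetical stand-in for Calcite's DatabaseProduct; only a few values are listed.
enum Product { MYSQL, ORACLE, POSTGRESQL, HIVE }

final class DialectResolver {
    private DialectResolver() {}

    /**
     * Returns a label for the parser settings to use.
     * A present dialect must name one of the Product values (Enum.valueOf throws
     * IllegalArgumentException otherwise); an absent dialect selects the previous
     * default, labelled here as MYSQL_5.
     */
    static String parserConfigFor(Optional<String> dialect) {
        return dialect
            // Assumption: names are normalised to upper case before lookup.
            .map(name -> Product.valueOf(name.trim().toUpperCase(Locale.ROOT)))
            .map(product -> "dialect:" + product.name())
            .orElse("default:MYSQL_5");
    }
}
```

In this sketch an unknown dialect name fails fast with an exception instead of silently using the default, which is one possible reading of "must be one of DatabaseProduct".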

TODO:

Close #23

lingyv-li commented 3 years ago

Ready for review. @sambekar15, regarding documentation: the internal document has not been copied here yet, so I am posting the documentation here and you can help add it.

dialect

This is an optional tag that marks the dialect of the scripts discovered in this block. All files matched by the includes and excludes rules in this block are marked with this dialect, which is later passed to the parser implementations.

For SQL scripts, a provided dialect must be one of the DatabaseProduct values. If no dialect is provided, the default (Oracle Lex with MYSQL_5 mode) is used.
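As a rough illustration of the tagging behaviour described above, the sketch below marks every file matched by a block's includes and excludes rules with that block's dialect. It is not Superglue's implementation: `InputBlock` and `dialectFor` are hypothetical names, and the use of glob patterns for the rules is an assumption.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

// Hypothetical model of one input block: include/exclude globs plus an optional dialect.
final class InputBlock {
    private final List<PathMatcher> includes;
    private final List<PathMatcher> excludes;
    private final Optional<String> dialect;

    InputBlock(List<String> includes, List<String> excludes, Optional<String> dialect) {
        this.includes = toMatchers(includes);
        this.excludes = toMatchers(excludes);
        this.dialect = dialect;
    }

    // Assumption: rules are written as glob patterns.
    private static List<PathMatcher> toMatchers(List<String> globs) {
        return globs.stream()
            .map(glob -> FileSystems.getDefault().getPathMatcher("glob:" + glob))
            .collect(Collectors.toList());
    }

    /** A file belongs to the block if any include matches and no exclude matches. */
    boolean matches(Path file) {
        return includes.stream().anyMatch(m -> m.matches(file))
            && excludes.stream().noneMatch(m -> m.matches(file));
    }

    /** Dialect tag for a file; empty if the file is unmatched or the block sets no dialect. */
    Optional<String> dialectFor(Path file) {
        return matches(file) ? dialect : Optional.empty();
    }
}
```

An empty result here corresponds to the "no dialect provided" case, in which the parser would use its default settings.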

sambekar15 commented 3 years ago

Thanks lingyv-li for creating this PR. I'll get to it this week and provide feedback.

lingyv-li commented 3 years ago

Hi @sambekar15, have you had a chance to review?

lingyv-li commented 3 years ago

@sambekar15 done!