databrickslabs / remorph

Cross-compiler and Data Reconciler into Databricks Lakehouse

Validate SQL code without spinning up a cluster #65

Open nfx opened 10 months ago

nfx commented 10 months ago

All transpiled code should be validated for syntax. What are the options for validating a query?

a) Plug into the Catalyst SQL parser: https://stackoverflow.com/questions/46973729/how-to-validate-spark-sql-expression-without-executing-it
b) Optionally, use LLMs to validate the code by executing the transpiled queries and self-healing any errors.
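A rough sketch of what option (a) could look like from Python. Nothing here is taken from remorph itself: `ValidationResult`, `validate_queries`, and `catalyst_parse_fn` are hypothetical names. The sketch assumes classic (non-Connect) PySpark and a JVM are installed locally, and it reaches the Catalyst parser through PySpark's private `_jsparkSession` handle, which is not a stable public API. A local-mode session starts on the developer machine, so no cluster is involved.

```python
from typing import Callable, Iterable, List, NamedTuple, Optional


class ValidationResult(NamedTuple):
    sql: str
    ok: bool
    error: Optional[str]


def validate_queries(
    queries: Iterable[str], parse: Callable[[str], object]
) -> List[ValidationResult]:
    """Run every query through `parse` and record whether it raised."""
    results = []
    for sql in queries:
        try:
            parse(sql)  # parse only; nothing is executed
            results.append(ValidationResult(sql, True, None))
        except Exception as exc:  # parser error types differ by backend
            results.append(ValidationResult(sql, False, str(exc)))
    return results


def catalyst_parse_fn() -> Callable[[str], object]:
    """Hypothetical helper: return Catalyst's parsePlan from a local session.

    Assumption: classic PySpark plus a JVM are available locally. The
    `_jsparkSession` attribute is private and may change between releases.
    """
    from pyspark.sql import SparkSession  # imported lazily; optional dependency

    spark = (
        SparkSession.builder.master("local[1]")  # local mode, no cluster
        .appName("sql-validate")
        .getOrCreate()
    )
    return spark._jsparkSession.sessionState().sqlParser().parsePlan
```

With PySpark installed, `validate_queries(transpiled_sql, catalyst_parse_fn())` would return one result per query. Parsing checks syntax only; it does not resolve tables or columns, so a query that parses cleanly can still fail at analysis time.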

nsenno-dbr commented 7 months ago

@nfx, as an incremental step, could we use a serverless SQL warehouse? Right now we are getting feedback that it takes ~5 minutes to spin up a cluster for validation (which is expected in the current configuration).
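One way the warehouse idea might be sketched: wrap each transpiled query in `EXPLAIN` and send it through a DB-API style cursor, so the query is planned but not run. `explain_error` and `open_warehouse_cursor` are hypothetical names, and the sketch assumes the `databricks-sql-connector` package for the connection. It also assumes that syntax errors surface as exceptions from `execute()`; analysis errors (for example a missing table) may instead come back inside the plan text, which this sketch does not inspect.

```python
from typing import Any, Optional


def explain_error(cursor: Any, query: str) -> Optional[str]:
    """Return None if `EXPLAIN <query>` succeeds, else the error message.

    `cursor` is any DB-API 2.0 style cursor exposing execute().
    """
    # Drop surrounding whitespace and a trailing semicolon before wrapping.
    statement = "EXPLAIN " + query.strip().rstrip(";")
    try:
        cursor.execute(statement)
        return None
    except Exception as exc:  # driver-specific error classes vary
        return str(exc)


def open_warehouse_cursor(server_hostname: str, http_path: str, access_token: str) -> Any:
    """Hypothetical helper: open a cursor on a SQL warehouse.

    Assumption: the databricks-sql-connector package is installed and the
    caller supplies the warehouse's hostname, HTTP path, and a token.
    """
    from databricks import sql  # imported lazily; optional dependency

    connection = sql.connect(
        server_hostname=server_hostname,
        http_path=http_path,
        access_token=access_token,
    )
    return connection.cursor()
```

A caller would open one cursor, loop over the transpiled queries with `explain_error(cursor, q)`, and collect the non-`None` results as validation failures.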