Canner / WrenAI

🚀 An open-source SQL AI (Text-to-SQL) Agent that empowers data, product teams to chat with their data. 🤘
https://getwren.ai/oss
GNU Affero General Public License v3.0
2.06k stars, 215 forks

Feature: Databricks SQL Database/Lakehouse support #322

Open jtrangel opened 5 months ago

jtrangel commented 5 months ago

Would love to see a connector or integration for Databricks. Not sure if it will be tricky, since some of the data objects (tables/views) in Databricks aren't actually stored within the Databricks filesystem. Others are just referenced externally from object storage (ADLS/S3).

wwwy3y3 commented 5 months ago

Hi @jtrangel , thanks for the feature request.

"... Not sure if it will be tricky since some of the data objects (tables/views) in Databricks aren't actually within the databricks filesystem. Others are just referenced externally from object storage (ADLS/S3)."

For object storage, you can use DuckDB (https://docs.getwren.ai/guide/connect/duckdb) to query files directly (e.g., CSV, Parquet, JSON).
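To make the DuckDB suggestion concrete, here is a minimal sketch of the kind of SQL DuckDB accepts for reading files. The helper name `build_file_query` and the bucket path are hypothetical, not part of WrenAI. `read_csv_auto`, `read_parquet`, and `read_json_auto` are DuckDB table functions. Reading from S3 typically also requires DuckDB's `httpfs` extension and credentials, which are not shown here.

```python
from pathlib import PurePosixPath

# DuckDB table functions for each supported file type.
READERS = {
    ".csv": "read_csv_auto",
    ".parquet": "read_parquet",
    ".json": "read_json_auto",
}


def build_file_query(path: str, limit: int = 5) -> str:
    """Return a DuckDB SELECT statement over a CSV, Parquet, or JSON file.

    Hypothetical helper for illustration only; `path` may be a local path
    or an object-storage URL such as s3://bucket/key.parquet.
    """
    suffix = PurePosixPath(path).suffix.lower()
    if suffix not in READERS:
        raise ValueError(f"unsupported file type: {suffix!r}")
    escaped = path.replace("'", "''")  # escape quotes inside the SQL literal
    return f"SELECT * FROM {READERS[suffix]}('{escaped}') LIMIT {int(limit)}"


# Example with an assumed bucket name:
sql = build_file_query("s3://example-bucket/events/*.parquet")
print(sql)
# SELECT * FROM read_parquet('s3://example-bucket/events/*.parquet') LIMIT 5
```

With the `duckdb` Python package installed, the resulting string could be run with something like `duckdb.connect().execute(sql).fetchall()`.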

Would love to see a connector or integration over Databricks. ...

Additionally, we're working on using the ibis-project to connect to data sources. The idea is that WrenAI generates ibis-executable SQL, which is then translated into the dialect of the specific data source. You can see how we invoke the ibis-project in the Wren Engine repo branch. It's still a work in progress.

So for Databricks support, we'll check whether ibis already has an integration. If it does, we could easily add it to our roadmap once the refactor to ibis is done.

wwwy3y3 commented 5 months ago

Feel free to upvote here https://github.com/Canner/WrenAI/discussions/327#discussioncomment-9578188 and share more about your use cases.