Closed: matreyes closed this 6 months ago
Thank you @matreyes! I have some questions to understand your use case better:
Answering your questions
Why not use Explorer/DataFrames for the data manipulation? I currently work for a big corp with lots of tools and languages; the only common language between engineers, scientists, and analysts is SQL. Also, persistence (views or tables) could be useful, and I can reuse DuckDB transformations in our DW (BigQuery) later on with a few changes.
Do you actually have a DuckDB on disk? I use both (memory and disk). On disk is great for having a kind of local "datamart"; in memory resembles a typical dataframe workflow more closely. Also, "Running on a persistent database allows spilling to disk, thus facilitating larger-than-memory workloads (i.e., out-of-core-processing)."
How are you currently importing data into DuckDB to work with it? I usually import CSV or Parquet files directly into DuckDB.
Thank you! This looks good to me. If you are happy with it, we can merge it.
To be clear, let me know if it is ready by saying yes or no :)
Thanks @jonatanklosko , It's OK from my side.
:green_heart: :blue_heart: :purple_heart: :yellow_heart: :heart:
In my role as a Data Engineer I've been moving away from dataframes (i.e., Spark), since I'm building everything with dbt and BigQuery, and locally with DuckDB (CLI). Using just SQL to build the different stages (a kind of delta lake) has been a great way to facilitate the work of data scientists and to support data governance.
I can see how DuckDB could work as the ingestion and transformation layer, exposing simplified and possibly materialized "data marts" to Explorer or VegaLite. This is so powerful!
Amazing work with ADBC !