dacort / duckdb-athena-extension

An experimental Athena extension for DuckDB 🐤
MIT License

I am wondering what's the point of this extension #7

Closed: MacHu-GWU closed this issue 1 year ago

MacHu-GWU commented 1 year ago

We can just call the boto3 API to run an Athena query and dump the result as Parquet, then use DuckDB to do whatever we need. Is there anything this extension can do that Athena cannot?

dacort commented 1 year ago

Heh, hi there. 👋

First, this extension was largely an experiment for me to get familiar with Rust and explore the DuckDB extension ecosystem. :)

You can totally just run an Athena query and dump the result in Parquet. That would work fine! It does require you to have an S3 bucket handy to dump your Parquet to, and you need to figure out the UNLOAD syntax. Not hard, but an extra step. If you're using the boto3 API, that's yet another set of steps that a typical DuckDB user might not want to do.
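For reference, that manual route could look roughly like the sketch below. The helper names `build_unload` and `run_and_wait` are made up for illustration, as are the bucket paths in the usage note; `client` is assumed to behave like `boto3.client("athena")`, and only its `start_query_execution` and `get_query_execution` calls are used.

```python
import time


def build_unload(query: str, s3_prefix: str) -> str:
    """Wrap a SELECT in Athena's UNLOAD so the result lands in S3 as Parquet.

    Hypothetical helper: it only builds the SQL string.
    """
    body = query.strip().rstrip(";")
    return f"UNLOAD ({body}) TO '{s3_prefix}' WITH (format = 'PARQUET')"


def run_and_wait(client, sql: str, output_location: str, poll_seconds: float = 1.0) -> str:
    """Start an Athena query and block until it reaches a terminal state.

    `client` is assumed to behave like boto3.client("athena").
    Returns the query execution id on success.
    """
    started = client.start_query_execution(
        QueryString=sql,
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]
    while True:
        execution = client.get_query_execution(QueryExecutionId=query_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state == "SUCCEEDED":
            return query_id
        if state in ("FAILED", "CANCELLED"):
            raise RuntimeError(f"Athena query {query_id} ended in state {state}")
        time.sleep(poll_seconds)
```

With real credentials, usage would be something like `run_and_wait(boto3.client("athena"), build_unload("SELECT * FROM my_db.my_table", "s3://my-bucket/unload/"), "s3://my-bucket/athena-results/")`, followed in DuckDB by `SELECT * FROM read_parquet('s3://my-bucket/unload/*')`. That last step assumes DuckDB's `httpfs` extension is loaded and S3 credentials are configured, and that the UNLOAD prefix was empty beforehand.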

My idea for this extension was to make it super easy to query data from Athena in DuckDB. As simple as a copy/paste of your existing query. There are definitely a lot of other options, as noted in #5, but largely I'm excited about Athena and excited about DuckDB and wanted to experiment. :)