duckdb / postgres_scanner

https://duckdb.org/docs/extensions/postgres
MIT License

Read native parquet files from Postgres using DuckDB #240

Closed (sushrut141 closed 2 months ago)

sushrut141 commented 3 months ago

I'm the author of the pg_analytica extension that enables periodic parquet exports for Postgres tables. See: https://github.com/sushrut141/pg_analytica

The extension currently relies on the parquet_fdw extension to query the parquet files. If DuckDB implemented its own Foreign Data Wrapper for Postgres, querying would likely be much faster than with parquet_fdw, since DuckDB's optimiser is known for better pipelining and vectorisation of queries.

I was wondering if this is something we could collaborate on and package the two extensions as a complete analytics package for Postgres instances. I would be happy to contribute to the creation of this FDW if you can share some pointers.

Thanks

sushrut141 commented 2 months ago

Hi, friendly ping. Could you please comment on the feasibility of the above? Thanks

Mytherin commented 2 months ago

Thanks for reaching out! What you are describing sounds quite similar to pg_duckdb, which we have been working on in collaboration with a few other parties. Feel free to send a PR there if you are interested in contributing!