duckdb / community-extensions

https://duckdb.org/community_extensions

Add `sheetreader` as community extension #134

Status: Closed (closed by freddie-freeloader 1 month ago)

freddie-freeloader commented 1 month ago

Hi!

Last semester, I took part in a programming project organized by the DIMA group at TU Berlin. We created a small DuckDB extension named `sheetreader` that uses `sheetreader-core` (a fast multi-threaded XLSX parser) to import XLSX files into DuckDB.

We ran a few benchmarks comparing our extension to the import function provided by the spatial extension (`st_read`). Our first results indicate that, depending on several factors, the `sheetreader` extension is around 5 to 10 times faster than the spatial extension at parsing XLSX files and loading them into DuckDB (https://github.com/polydbms/sheetreader-duckdb/?tab=readme-ov-file#benchmarks).

We would like to offer this extension as a DuckDB community extension.

A note regarding the repository structure of our extension:

Let me know if there are adjustments to be made. :slightly_smiling_face: