Open Vonng opened 3 weeks ago
Thanks for looking into the build process and for distributing pg_mooncake with Pigsty! We appreciate the suggestion and have indeed considered using a shared libduckdb.so, but a few challenges come up with this approach:
Dependency on DuckDB's Internal API: Unlike duckdb_fdw, pg_mooncake depends on DuckDB's internal C++ API, which isn't guaranteed to be stable and can change between DuckDB releases. This means pg_mooncake may not be compatible with arbitrary versions of libduckdb.so, potentially causing compatibility issues if users install different versions of DuckDB.
Planned Use of Non-Builtin DuckDB Extensions: We're looking to leverage non-builtin DuckDB extensions like delta and iceberg to support reading external tables from third-party catalogs. This will require a way to ensure these extensions are consistently available in every installation, which may be harder to manage with a shared libduckdb.so.
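To make the first point concrete, here is a minimal sketch of the kind of guard a shared-library setup would need. It assumes the extension records the DuckDB version it was built against and can read the version of the loaded library (the DuckDB C API exposes `duckdb_library_version()` for this). The helper names and version strings below are hypothetical and only illustrate the exact-match policy that an unstable internal API implies.

```python
def parse_version(text: str) -> tuple[int, int, int]:
    """Parse a version string such as 'v1.1.3' or '1.1.3' into (1, 1, 3)."""
    core = text.strip().lstrip("v").split("-")[0]
    major, minor, patch = (int(part) for part in core.split("."))
    return major, minor, patch


def is_compatible(built_against: str, loaded: str) -> bool:
    """Hypothetical policy: the internal C++ API carries no stability
    guarantee, so only an exact version match is treated as safe."""
    return parse_version(built_against) == parse_version(loaded)


# Illustrative version strings, not taken from the thread.
print(is_compatible("v1.1.3", "1.1.3"))  # same release, different prefix
print(is_compatible("v1.1.3", "v1.1.2"))  # patch-level mismatch is rejected
```

With a policy this strict, a distribution's libduckdb.so package would have to be pinned to the exact release each pg_mooncake build targets.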
We appreciate the work you've done on the RPM/DEB packages and are interested in collaborating on improving the build and distribution process while ensuring compatibility and stability.
There are significant challenges with the current approach of embedding libduckdb directly:
Compilation Time and Package Size: Embedding libduckdb requires a substantial amount of compilation time and dramatically increases the size of the package. The problem is compounded by the build matrix: 3 PostgreSQL major versions across 5 OS distributions.
Conflict with pg_duckdb: Embedding libduckdb conflicts with pg_duckdb, forcing users to choose one or the other. This restriction makes adoption unnecessarily difficult.
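The size of the build matrix mentioned above can be sketched as follows. The OS targets are the five distributions named later in this post; the PostgreSQL major-version labels are placeholders, since the post does not say which three versions are targeted.

```python
from itertools import product

# Placeholder labels: the post does not name the three PostgreSQL majors.
PG_MAJORS = ["pg-a", "pg-b", "pg-c"]
# The five distributions for which libduckdb packages are mentioned.
OS_TARGETS = ["el8", "el9", "ubuntu22.04", "ubuntu24.04", "debian12"]


def build_matrix(pg_majors, os_targets):
    """Return every (PostgreSQL major, OS) pair that needs its own package."""
    return list(product(pg_majors, os_targets))


matrix = build_matrix(PG_MAJORS, OS_TARGETS)

# Embedded: every extension package compiles and ships its own DuckDB copy.
embedded_duckdb_builds = len(matrix)
# Shared (assumption): one libduckdb package per OS, reused by all majors.
shared_duckdb_packages = len(OS_TARGETS)

print(embedded_duckdb_builds, shared_duckdb_packages)
```

Under these assumptions, the embedded approach compiles DuckDB once per matrix cell, while a shared library would be packaged once per distribution.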
To address these concerns, I propose adopting a shared libduckdb, similar to the approach used in duckdb_fdw, available at this commit. The official DuckDB release provides a binary libduckdb.so, and I have already created RPM/DEB packages for EL8/EL9, Ubuntu 22.04/24.04, and Debian 12, which are readily available at [ext.pigsty.io/#/](https://ext.pigsty.io). Adopting a shared libduckdb would mitigate the issues with compilation time, package size, and software conflicts, ultimately simplifying maintenance and user choice.