marcboeker / go-duckdb

go-duckdb provides a database/sql driver for the DuckDB database engine.
MIT License
583 stars 96 forks source link

Use build tags for Arrow implementation #205

Closed ayuhito closed 1 month ago

ayuhito commented 2 months ago

Closes #204.

I tested the build flags on my project and saw around ~2MB shaved off the final binary, which is a solid success. Including the duckdb_arrow build tag reverted it back to the old size, so it seems that the tag is working.

I'm not too sure what the breaking change policy is for this library, so I'm happy to make appropriate changes. However, I strongly believe features like this should be opt-in, rather than opt-out.

taniabogatsch commented 1 month ago

Hi @ayuhito, I've discussed this with @marcboeker. While we agree that opt-in is more sensible for features like the arrow support, the appender, and upcoming features like UDFs, we do not want to break the current version. Can this PR be changed to support easy opt-out for these and keep the default build as-is? Then, once go-duckdb moves to a v2, we can swap to opt-in. What do you think?

ayuhito commented 1 month ago

Hi @taniabogatsch. That makes perfect sense. I'll update the PR to use no_duckdb_arrow instead.

For clarity, I'm not sure if offering build tag options for the appender and upcoming UDF features will make a noticeable difference to the code since they are small snippets that do not introduce any new dependencies. I'll verify if omitting appender.go makes a difference, but I'm suspecting it would be unnecessary.

Also regarding V2, may I suggest using a similar pattern mentioned in the edit (although it being a slightly more annoying migration path for users). It eliminates the need for build tags and enables proper deadcode elimination to take place from the Go linker side. It also handles the appender and UDF cases too, however small the gains may be.

taniabogatsch commented 1 month ago

Great, thanks! Let's focus on opt-out for arrow in this PR, then.

Your point w.r.t. the minimal impact for the appender, etc. is fair. With that in mind, making them opt-out (and eventually opt-in) would only create a logical separation between driver functionality (database/sql) and additional features. 🤔 Which will be much cleaner by introducing your pattern suggestion.

ayuhito commented 1 month ago

Conflicts should be resolved now. 👍 Didn't realize the GitHub Actions workflows that I was tinkering with in a separate branch will affect this PR.