marcboeker / go-duckdb

go-duckdb provides a database/sql driver for the DuckDB database engine.
MIT License

Make Apache Arrow Optional #204

Closed. ayuhito closed this issue 1 month ago

ayuhito commented 2 months ago

#134 added support for Apache Arrow but introduced a pretty heavy dependency in the process. When I analysed my binaries using gsa, github.com/apache/arrow/go/v14 alone added ~800 KB to my binary. That dependency also imports github.com/goccy/go-json, which adds another ~750 KB. This seems excessive if you do not use Arrow.

Would it be possible to introduce build tags and make this feature optional instead?

ayuhito commented 2 months ago

Perhaps dead code elimination doesn't really work since we import the database like this?

import (
    _ "github.com/marcboeker/go-duckdb"
)

Edit: I've seen other database implementations split features into separate packages that you import explicitly, as an alternative to build tags. For example:

import (
    _ "github.com/marcboeker/go-duckdb/driver"
    _ "github.com/marcboeker/go-duckdb/embed"
    "github.com/marcboeker/go-duckdb/arrow"
)

marcboeker commented 2 months ago

Thanks for bringing this up. We'll have a look at it. Maybe we can include the Arrow and Appender interfaces by default (as they are right now) and add build tags to exclude them.
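
For illustration, the opt-in package layout suggested earlier in the thread could be sketched roughly as follows. This is not go-duckdb's actual code: the registry, `RegisterFeature`, and `EnabledFeatures` are hypothetical names, and everything is collapsed into one file so it runs on its own. In a real layout the `init` function would live in a separate optional package, so the heavy dependency would only be linked when that package is imported.

```go
package main

import (
	"fmt"
	"sort"
)

// features is a hypothetical registry that a core driver package could keep.
// Optional features add themselves here instead of being compiled in always.
var features = map[string]func() string{}

// RegisterFeature is what an optional sub-package would call from its init().
func RegisterFeature(name string, describe func() string) {
	features[name] = describe
}

// In a real layout this init would sit in its own package (for example a
// hypothetical ".../arrow" package) and would only run when a program
// imports that package. It is inlined here to keep the sketch in one file.
func init() {
	RegisterFeature("arrow", func() string { return "Arrow query interface" })
}

// EnabledFeatures returns the sorted names of all registered features.
func EnabledFeatures() []string {
	names := make([]string, 0, len(features))
	for name := range features {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	for _, name := range EnabledFeatures() {
		fmt.Printf("%s: %s\n", name, features[name]())
	}
}
```

With build tags instead, the file holding the optional code would carry a `//go:build` constraint line with a tag name chosen by the project, and the default behaviour (included or excluded) would depend on whether the constraint is negated.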