marcboeker / go-duckdb

go-duckdb provides a database/sql driver for the DuckDB database engine.
MIT License
583 stars 96 forks source link

Initial DataChunk API support #233

Closed taniabogatsch closed 1 month ago

taniabogatsch commented 1 month ago

This PR introduces a new DataChunk API. See the discussion here. Ideally, all interaction between go-duckdb and duckdb's data chunks eventually goes through this API without compromising performance. As an outlook, in #219, I further used this API for go-duckdb's scanner.

As a first step, this PR moves all interaction between the Appender and data chunks to the DataChunk API and its setter functions, further decreasing the complexity of the Appender code.

This PR adds the following functions:

Additionally, I reverted the column index in the data chunk when returning an unsupported type error message to 0-based indexing.

Questions

ajzo90 commented 1 month ago

Nice! A few suggestions on the public interface.

Perhaps InitFromTypes and SetSize should be private methods (for now)? InitFromTypes because it's using internal arguments and SetSize since it's likely not necessary to expose publicly?

Destroy can also be private until we have cases where it needs to be public. If/when it goes public I think the signature should be Close() error to follow go naming conventions.

taniabogatsch commented 1 month ago

@ajzo90, good call, I've made them private. Do you have any other suggestions?

JAicewizard commented 1 month ago

Small question, is there a reason projection support wasn't added? If not could this still be added via another PR? and if yet should we make a new type to support projection?

JAicewizard commented 1 month ago

Also for table functions SetSize is must be exposed, as a tablefunction might not want to fill the entire chunk, I will expose this in my PR. (Same for GetSize)