marcboeker / go-duckdb

go-duckdb provides a database/sql driver for the DuckDB database engine.
MIT License

Move row scanning into the data chunk interface #240

Closed taniabogatsch closed 3 months ago

taniabogatsch commented 3 months ago

This PR expands the existing DataChunk interface with GetValue. I've moved all the scanning functionality out of rows.go and into the respective getter functions. The getter function matches the fnGetVectorValue type: type fnGetVectorValue func(vec *vector, rowIdx C.idx_t) any. During data chunk initialization, we now initialize both the setter and getter functions.
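A minimal sketch of the getter pattern described above, not the driver's actual code. To keep it buildable without cgo, `idxT` stands in for `C.idx_t`, the `vector` struct holds plain Go slices instead of a duckdb vector, and `colType`, `initGetter`, `dataChunk`, and `newExampleChunk` are hypothetical names chosen for illustration. The point is that the getter is picked once per column at initialization, so `GetValue` does no per-row type switch.

```go
package main

import (
	"errors"
	"fmt"
)

// idxT stands in for C.idx_t so this sketch builds without cgo (assumption).
type idxT uint64

// fnGetVectorValue mirrors the shape of the getter type named in the PR.
type fnGetVectorValue func(vec *vector, rowIdx idxT) any

// vector is a simplified stand-in: plain slices replace the C-backed data.
type vector struct {
	ints  []int64
	strs  []string
	getFn fnGetVectorValue
}

// colType is a hypothetical, reduced set of column types.
type colType int

const (
	typeBigint colType = iota
	typeVarchar
)

// initGetter selects the getter once, when the vector is initialized.
func (vec *vector) initGetter(t colType) error {
	switch t {
	case typeBigint:
		vec.getFn = func(v *vector, rowIdx idxT) any { return v.ints[rowIdx] }
	case typeVarchar:
		vec.getFn = func(v *vector, rowIdx idxT) any { return v.strs[rowIdx] }
	default:
		return errors.New("unsupported column type")
	}
	return nil
}

// dataChunk is a hypothetical, simplified chunk holding one vector per column.
type dataChunk struct {
	columns []vector
}

// GetValue dispatches to the getter chosen at initialization time.
func (chunk *dataChunk) GetValue(colIdx int, rowIdx int) (any, error) {
	if colIdx < 0 || colIdx >= len(chunk.columns) {
		return nil, fmt.Errorf("column index %d out of range", colIdx)
	}
	col := &chunk.columns[colIdx]
	return col.getFn(col, idxT(rowIdx)), nil
}

// newExampleChunk builds a two-column chunk with illustrative data.
func newExampleChunk() (*dataChunk, error) {
	chunk := &dataChunk{columns: []vector{
		{ints: []int64{10, 20}},
		{strs: []string{"a", "b"}},
	}}
	types := []colType{typeBigint, typeVarchar}
	for i := range chunk.columns {
		if err := chunk.columns[i].initGetter(types[i]); err != nil {
			return nil, err
		}
	}
	return chunk, nil
}

func main() {
	chunk, err := newExampleChunk()
	if err != nil {
		panic(err)
	}
	val, err := chunk.GetValue(1, 0)
	fmt.Println(val, err)
}
```
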

A possible optimization for multiple data chunks of the same result is to reuse the already initialized vectors and only swap in the new duckdb vector and its data. Even without that optimization, this PR should speed up scanning. I've run the BenchmarkTypes benchmark locally on both main and this PR.

| Benchmark | main | PR |
|---|---|---|
| BenchmarkTypes-10 | 341,039,886 ns/op | 271,966,542 ns/op |

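
The vector-reuse idea mentioned above (not part of this PR) could look roughly like the following. This is only a sketch under simplifying assumptions: every column is an `int64` slice rather than a C-backed duckdb vector, and `rebind`, `getValue`, `newDataChunk`, and `initCount` are hypothetical names. The getters are initialized once, and each subsequent chunk of the same result only swaps the underlying data.

```go
package main

import "fmt"

// vector is a simplified stand-in; data replaces the C-backed column data.
type vector struct {
	data  []int64
	getFn func(vec *vector, rowIdx uint64) any
}

// initCount records how often getters are initialized, to make reuse visible.
var initCount int

// initGetter sets the getter once; it does not depend on the current data.
func (vec *vector) initGetter() {
	initCount++
	vec.getFn = func(v *vector, rowIdx uint64) any { return v.data[rowIdx] }
}

// dataChunk is a hypothetical chunk that is reused across result chunks.
type dataChunk struct {
	columns []vector
}

// newDataChunk initializes all column getters up front.
func newDataChunk(numCols int) *dataChunk {
	chunk := &dataChunk{columns: make([]vector, numCols)}
	for i := range chunk.columns {
		chunk.columns[i].initGetter()
	}
	return chunk
}

// rebind points the already initialized vectors at the next chunk's data
// without touching the getters.
func (chunk *dataChunk) rebind(cols [][]int64) error {
	if len(cols) != len(chunk.columns) {
		return fmt.Errorf("got %d columns, want %d", len(cols), len(chunk.columns))
	}
	for i := range chunk.columns {
		chunk.columns[i].data = cols[i]
	}
	return nil
}

// getValue reads one value through the getter chosen at initialization.
func (chunk *dataChunk) getValue(colIdx int, rowIdx int) any {
	col := &chunk.columns[colIdx]
	return col.getFn(col, uint64(rowIdx))
}

func main() {
	chunk := newDataChunk(2)
	batches := [][][]int64{
		{{1, 2}, {3, 4}},
		{{5, 6}, {7, 8}},
	}
	for _, batch := range batches {
		if err := chunk.rebind(batch); err != nil {
			panic(err)
		}
		fmt.Println(chunk.getValue(1, 0))
	}
	fmt.Println("getter initializations:", initCount)
}
```

Whether this pays off in the real driver depends on how expensive vector initialization is relative to fetching a chunk, which the benchmark above does not measure.
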
marcboeker commented 3 months ago

@taniabogatsch Thanks for the improvement.