marcboeker / go-duckdb

go-duckdb provides a database/sql driver for the DuckDB database engine.
MIT License
623 stars 97 forks source link

feat(appender): Allow typed nils in first row, dereference pointers #168

Closed disq closed 6 months ago

disq commented 6 months ago

Removes the requirement of having non-null columns in the first row, by using typed nils (and dereferencing pointers if necessary)

The check for typed nils (using reflection) may be considered an expensive one, I would prefer if initColTypes was exported, then it can be called separately (optionally) with concrete types if caller knows the first row may contain nulls, then we can remove the typed nils and the check for it (and close this PR without merging)

disq commented 6 months ago

cc @taniabogatsch please can you also take a look?

taniabogatsch commented 6 months ago

Hi @disq! Thanks for the PR - the appender/data chunk types must match the types in the duckdb table. As of duckdb's C API version 0.10., we can remove inferring the types from the first row altogether. @maiadegraaf opened a respective issue here. This might prove cleaner than the current solution, and I'd also prefer it to your changes. What do you think?

DUCKDB_API idx_t duckdb_appender_column_count(duckdb_appender appender);
DUCKDB_API duckdb_logical_type duckdb_appender_column_type(duckdb_appender appender, idx_t col_idx);

After creating the appender, you must first retrieve the column count. Then, for each column, you retrieve the type. Then, you have all the necessary types to initialize the data chunks.

Let me know if you pick this up. Otherwise, I'll eventually have a go at it.

maiadegraaf commented 6 months ago

Hi @disq, I also considered your approach when writing the appender but in my experience it only works when the nil being passed has already been explicitly been typed.

In the following case we still wouldn't be able to determine the type:

err = appender.AppendRow(nil)

So I'm inclined to agree with @taniabogatsch that the best way to allow nils in the first row would be to use the new CAPI functions, which would allow us to bypass determining types from the first row all together.

disq commented 6 months ago

@taniabogatsch @maiadegraaf Thanks for your input. Yeah, requiring typed nils also require some kind of set up in the caller (as typed nils are considered uncommon) I'd prefer the new approach as well. I don't think I can pick this up this week, will let you know if I do.

Closing PR, but I'll hang onto the branch in my local for now.

taniabogatsch commented 6 months ago

PR is up.