marcboeker / go-duckdb

go-duckdb provides a database/sql driver for the DuckDB database engine.
MIT License
623 stars 97 forks source link

Initialise column types during appender creation #171

Closed taniabogatsch closed 6 months ago

taniabogatsch commented 6 months ago

As discussed in #168, this PR adds support for initializing the column types of a data chunk when creating an appender.

The PR builds on https://github.com/marcboeker/go-duckdb/pull/167. I'll undraft it once #167 is merged.

taniabogatsch commented 6 months ago

I added a map to track all as-of-now unsupported types in the appender. We can incrementally add these (if they make sense in Go).

disq commented 6 months ago

@taniabogatsch in appendRowArray, the untyped nil check (quoted below) doesn't seem to be enough for nullable list items.

        if v == nil {
            setNull(&info, a.currSize)
            continue
        }

To be able to append a list with a null value in it, the slice has to be of type []*T. But typeMatch won't allow it:

type mismatch for column 29: expected: []int64, actual: []*int64

select column_name, ordinal_position, is_nullable, data_type from information_schema.columns where ordinal_position=30;
┌─────────────┬──────────────────┬─────────────┬───────────┐
│ column_name │ ordinal_position │ is_nullable │ data_type │
│   varchar   │      int32       │   varchar   │  varchar  │
├─────────────┼──────────────────┼─────────────┼───────────┤
│ id_list     │               30 │ YES         │ BIGINT[]  │
└─────────────┴──────────────────┴─────────────┴───────────┘
D 

Here's a test to show what I mean, I'm investigating possible solutions to this.

func TestAppenderNullBasicList(t *testing.T) {
    connector, con, appender := prepareAppender(t, `CREATE TABLE test (int_slice bigint[])`)
    defer con.Close()
    defer connector.Close()

    // An empty list must also initialize the logical types.
    err := appender.AppendRow([]int64{})
    require.NoError(t, err)

    err = appender.AppendRow([]int64{1, 2, 3})
    require.NoError(t, err)

    err = appender.AppendRow(nil)
    require.NoError(t, err)

    p := func(i int64) *int64 {
        return &i
    }

    err = appender.AppendRow([]*int64{p(1), p(2), nil, p(4)})
    require.NoError(t, err)

    err = appender.Close()
    require.NoError(t, err)

    // Verify results.
    db := sql.OpenDB(connector)
    res, err := db.QueryContext(
        context.Background(),
        `SELECT int_slice FROM test`)
    require.NoError(t, err)
    defer res.Close()

    var strResult []string
    strResult = append(strResult, "[]")
    strResult = append(strResult, "[1 2 3]")
    strResult = append(strResult, "<nil>")
    strResult = append(strResult, "[1 2 <nil> 4]")

    i := 0
    for res.Next() {
        var strS string
        var intS []any
        err := res.Scan(
            &intS,
        )
        if err != nil {
            strS = "<nil>"
        } else {
            strS = fmt.Sprintf("%v", intS)
        }

        require.Equal(t, strResult[i], strS, fmt.Sprintf("row %d: expected %v, got %v", i, strResult[i], strS))
        i++
    }
}
taniabogatsch commented 6 months ago

Also, cc @marcboeker, what is your take on allowing (nil)-pointers as types in the appender? I understand it is an elegant solution to better support NULL types for now. Is there another way? We could potentially also move to the database/sql package NULL types. 🤔

That is, we focus on column type initialization in this PR. Then, we can approach the whole NULL topic in a separate, dedicated PR. As I already noticed other tests breaking, if I extend them to this solution. cc @disq ?

marcboeker commented 6 months ago

167 is merged now 🙂

disq commented 6 months ago

Also, cc @marcboeker, what is your take on allowing (nil)-pointers as types in the appender? I understand it is an elegant solution to better support NULL types for now. Is there another way? We could potentially also move to the database/sql package NULL types. 🤔

That is, we focus on column type initialization in this PR. Then, we can approach the whole NULL topic in a separate, dedicated PR. As I already noticed other tests breaking, if I extend them to this solution. cc @disq ?

@taniabogatsch Since Appender isn't really a database/sql interface, it's OK (even encouraged?) to keep it separate from the rest of database/sql (sql.Null* types). But the appender is using other concepts from database/sql (does it? except driver.Conn, which is casted to *conn immediately) then it would make sense to port everything to sql.Null*.

taniabogatsch commented 6 months ago

@disq - fair point! I'll disentangle this PR to focus on the type creation/initialization. Then, I'll open a separate PR and port my NULL changes to that one. That way, we do not mix both, and I can focus on adding a few more tests for the NULLs. Eventually, we can discuss better sql.Null* type support in the appender/go-duckdb, as raised in https://github.com/marcboeker/go-duckdb/issues/89.

taniabogatsch commented 6 months ago

I am closing this to favor #177 and another (soon-to-be) PR for better NULL support.