jackc / pgx

PostgreSQL driver and toolkit for Go
MIT License
10.52k stars 832 forks source link

Support passing type that implements sql.Scanner by double pointer #1000

Open georgysavva opened 3 years ago

georgysavva commented 3 years ago

Hello! With the current pgx version I have the following problem. Imagine you have a type that implements the sql.Scanner interface, e.g. uuid.UUID from the "github.com/google/uuid" package, and I want to scan into that type from a NULLable column. I assume that in order for pgx to handle the NULL case, I need to pass my type as a double-pointer, i.e. **uuid.UUID, so would nil value represent the NULL:

// import "github.com/google/uuid"
var id *uuid.UUID

// pgxDB, _ := pgxpool.Connect(...)
if err := pgxDB.QueryRow(ctx, `SELECT NULL as id`).Scan(&id); err != nil {
    fmt.Printf("1st query error: %s\n", err)
} else {
    fmt.Printf("1st query result: %s\n", id)
}
if err := pgxDB.QueryRow(ctx, `SELECT '347e7e54-881a-4583-9f86-b2aa9558f9ac' as id`).Scan(&id); err != nil {
    fmt.Printf("2nd query error: %s\n", err)
} else {
    fmt.Printf("2nd query result: %s\n", id)
}

But the following code prints an error for the second query:

1st query result: <nil>
2nd query error: can't scan into dest[0]: unable to assign to *[16]uint8

It's surprising because if I run the same scenario but on the database/sql and pgx as a driver, it succedes:

// import "github.com/google/uuid"
var id *uuid.UUID

// sqlDB := stdlib.OpenDB(...)
if err := sqlDB.QueryRow(`SELECT NULL as id`).Scan(&id); err != nil {
    fmt.Printf("3nd query error: %s\n", err)
} else {
    fmt.Printf("3nd query result: %s\n", id)
}
if err := sqlDB.QueryRow(`SELECT '347e7e54-881a-4583-9f86-b2aa9558f9ac' as id`).Scan(&id); err != nil {
    fmt.Printf("4th query error : %s\n", err)
} else {
    fmt.Printf("4th query result: %s\n", id)
}

The following code prints no errors for both queries:

3nd query result: <nil>
4th query result: 347e7e54-881a-4583-9f86-b2aa9558f9ac

This error also happens for all types from the pgtype package.

jackc commented 3 years ago

Works for me.

package main

import (
    "context"
    "log"
    "os"

    "github.com/google/uuid"
    "github.com/jackc/pgx/v4"
)

func main() {
    conn, err := pgx.Connect(context.Background(), os.Getenv("DATABASE_URL"))
    if err != nil {
        log.Fatal(err)
    }
    defer conn.Close(context.Background())

    var id *uuid.UUID
    err = conn.QueryRow(context.Background(), "select gen_random_uuid()").Scan(&id)
    if err != nil {
        log.Fatal(err)
    }

    log.Println("Random UUID", id)

    err = conn.QueryRow(context.Background(), "select null::uuid").Scan(&id)
    if err != nil {
        log.Fatal(err)
    }

    log.Println("NULL UUID", id)
}
jack@glados ~/dev/pgx_issues/pgx-1000 ±master⚡ » go run main.go
2021/05/08 07:55:08 Random UUID 053d532e-302f-4579-90d7-f217d9c06056
2021/05/08 07:55:08 NULL UUID <nil>
georgysavva commented 3 years ago

Thanks for providing a working example. I managed to find the reason why my code didn't work: ::uuid type hint is missing in the SQL query. If I add it everything is fine with that example:

// import "github.com/google/uuid"
var id *uuid.UUID

// pgxDB, _ := pgxpool.Connect(...)
if err := pgxDB.QueryRow(ctx, `SELECT NULL::uuid as id`).Scan(&id); err != nil {
    fmt.Printf("1st query error: %s\n", err)
} else {
    fmt.Printf("1st query result: %s\n", id)
}
if err := pgxDB.QueryRow(ctx, `SELECT '347e7e54-881a-4583-9f86-b2aa9558f9ac'::uuid as id`).Scan(&id); err != nil {
    fmt.Printf("2nd query error: %s\n", err)
} else {
    fmt.Printf("2nd query result: %s\n", id)
}

// sqlDB := stdlib.OpenDB(...)
if err := sqlDB.QueryRow(`SELECT NULL::uuid as id`).Scan(&id); err != nil {
    fmt.Printf("3nd query error: %s\n", err)
} else {
    fmt.Printf("3nd query result: %s\n", id)
}
if err := sqlDB.QueryRow(`SELECT '347e7e54-881a-4583-9f86-b2aa9558f9ac'::uuid as id`).Scan(&id); err != nil {
    fmt.Printf("4th query error : %s\n", err)
} else {
    fmt.Printf("4th query result: %s\n", id)
}

Prints:

1st query result: <nil>
2nd query result: 347e7e54-881a-4583-9f86-b2aa9558f9ac
3nd query result: <nil>
4th query result: 347e7e54-881a-4583-9f86-b2aa9558f9ac

But let's consider something other than a uuid, e.g. plain text. So instead of *uuid.UUID I am going to use *pgtype.Text:

// import "github.com/jackc/pgtype"
var text *pgtype.Text

// pgxDB, _ := pgxpool.Connect(...)
if err := pgxDB.QueryRow(ctx, `SELECT NULL as text`).Scan(&text); err != nil {
    fmt.Printf("1st query error: %s\n", err)
} else {
    fmt.Printf("1st query result: %v\n", text)
}
if err := pgxDB.QueryRow(ctx, `SELECT 'foo' as text`).Scan(&text); err != nil {
    fmt.Printf("2nd query error: %s\n", err)
} else {
    fmt.Printf("2nd query result: %v\n", text)
}

// sqlDB := stdlib.OpenDB(...)
if err := sqlDB.QueryRow(`SELECT NULL as text`).Scan(&text); err != nil {
    fmt.Printf("3nd query error: %s\n", err)
} else {
    fmt.Printf("3nd query result: %v\n", text)
}
if err := sqlDB.QueryRow(`SELECT 'foo' as text`).Scan(&text); err != nil {
    fmt.Printf("4th query error : %s\n", err)
} else {
    fmt.Printf("4th query result: %v\n", text)
}

This example misses the type hint in the SQL query (because it shouldn't require one for a plain text column and even if add ::text it doesn't help in that case) and it fails on the second query again:

1st query result: <nil>
2nd query error: can't scan into dest[0]: unable to assign to *pgtype.Text
3nd query result: <nil>
4th query result: &{foo 2}

BTW the situation is the same if I define a custom type Text:

type Text struct {
    ...
}

func (t *Text) Scan(src interface{}) error {
    ...
}

And use it instead of pgtype.Text in my queries.

jackc commented 3 years ago

I'm not sure what the best solution is here. Because **T does not implement the pgx or database/sql interfaces the value is read into the registered type (pgtype.Text in this case) and it tries to assign it with AssignTo. But obviously it doesn't know anything about **T.

Either each type's AssignTo would need to have logic to reflect on the destination or the reflection would have to be done before AssignTo. Neither is good. The former would require a lot of extra logic in AssignTo and the latter could have a negative performance impact.

idaunis commented 3 years ago

Hello, the main issue we are having is with pointers of structs wrapping google.UUID or any custom UUID. For example if we do this:


package main

import (
    "context"
    "fmt"

    "github.com/google/uuid"
    "github.com/jackc/pgtype"
    "github.com/jackc/pgx/v4"
)

type OtherCustomUUID struct {
    uuid.UUID
}

type CustomUUID struct {
    UUID
}

type UUID string

func main() {
    ctx := context.Background()
    conn, err := pgx.Connect(context.Background(), os.Getenv("DATABASE_URL"))
    if err != nil {
        fmt.Println(err)
        return
    }
    defer conn.Close(ctx)

    var id *pgtype.Text

    err = conn.QueryRow(context.Background(), "SELECT gen_random_uuid()").Scan(&id)
    if err != nil {
        fmt.Println(err)
    }
    fmt.Println("Random UUID", id)

    err = conn.QueryRow(context.Background(), "SELECT NULL::uuid as id").Scan(&id)
    if err != nil {
        fmt.Println(err)
    }
    fmt.Println("Null UUID", id)

    var id2 *CustomUUID

    err = conn.QueryRow(context.Background(), "SELECT gen_random_uuid()").Scan(&id2)
    if err != nil {
        fmt.Println(err)
    }
    fmt.Println("Random UUID", id2)

    err = conn.QueryRow(context.Background(), "SELECT NULL::uuid as id").Scan(&id2)
    if err != nil {
        fmt.Println(err)
    }
    fmt.Println("Null UUID", id2)

    var id3 *OtherCustomUUID

    err = conn.QueryRow(context.Background(), "SELECT gen_random_uuid()").Scan(&id3)
    if err != nil {
        fmt.Println(err)
    }
    fmt.Println("Random UUID", id3)

    err = conn.QueryRow(context.Background(), "SELECT NULL::uuid as id").Scan(&id3)
    if err != nil {
        fmt.Println(err)
    }
    fmt.Println("Null UUID", id3)
}

Output:

can't scan into dest[0]: cannot assign &{[13 224 29 92 54 192 71 110 146 240 45 228 121 91 131 177] 2} into *pgtype.Text
Random UUID &{ 0}
Null UUID <nil>
can't scan into dest[0]: cannot assign &{[185 81 146 156 245 115 67 53 164 105 19 185 90 123 28 143] 2} into *main.CustomUUID
Random UUID &{}
Null UUID <nil>
can't scan into dest[0]: cannot assign &{[219 208 107 150 110 57 77 246 165 7 70 72 43 125 254 120] 2} into *main.OtherCustomUUID
Random UUID 00000000-0000-0000-0000-000000000000
Null UUID <nil>

I propose this https://github.com/jackc/pgtype/pull/104 solution on pgtype. Let me know what you think.

georgysavva commented 3 years ago

Is there any progress with that issue? I would really want it to be addressed. It's turned out to be very confusing for users when they work with pgx via scany library https://github.com/georgysavva/scany, because they can't define types the same way for pgx and database/sql. Consider an example:

type Text struct {
    s string
}

func (t *Text) Scan(src interface{}) error {
    t.s = src.(string)
    return nil
}

type User struct {
    ID1 *Text
    ID2 *Text
}

var user1 User
// pgxDB, _ := pgxpool.Connect(...)
if err := pgxscan.Get(ctx, pgxDB, &user1, `SELECT NULL as id1, 'foo' as id2`); err != nil {
    fmt.Printf("pgxscan query error: %s\n", err)
} else {
    fmt.Printf("pgxscan query result: %v\n", user1)
}

var user2 User
// sqlDB := stdlib.OpenDB(...)
if err := sqlscan.Get(ctx, sqlDB, &user2, `SELECT NULL as id1, 'foo' as id2`); err != nil {
    fmt.Printf("sqlscan query error: %s\n", err)
} else {
    fmt.Printf("sqlscan query result: %v\n", user2)
}

Prints the following:

pgxscan query error: scany: scan row into struct fields: can't scan into dest[1]: unable to assign to *main.Text
sqlscan query result: {ID1:<nil> ID2:0xc000290720}

So the main point is that if you have a struct (Text) that implements the sql.Scanner interface and it can be NULLable, you can't define it by a pointer to handle the NULL case as you would do it with database/sql library.

Let me know if you need any help with that issue!

jackc commented 3 years ago

Sorry, I haven't had further opportunity for looking at this. The type system is already is complicated by a bunch of special cased beyond the simple idea of a scan target must implement an interface. It takes me a while to refamiliarize myself with it when I need to work with it. And I'm concerned if this type change would put reflection in a hot path.

There may be another solution in the long term though. I'm toying with the idea of a new major release to target Go 1.18+. Generics could greatly simplify pgtype (e.g. null handling and arrays). But along with that I'm considering alternate ways of handling types, scanning, and encoding. Possibly some sort of pluggable middleware layer. This layer would allow mappings between Go and PostgreSQL types without requiring any particular type to implement any particular interface.

georgysavva commented 3 years ago

I see. It's great to hear that there are ways to fix this issue in the future! But since there are no timelines for that I will try to implement a fix (like an exception) on the scany side until the issue is addressed in pgx. Please keep us updated. Thank you!

pkieltyka commented 3 years ago

hi all, I'm running into this issue as well. Recently I built https://github.com/goware/pgkit which is a light sugar on a combo of pgx+scany+squirrel and it's been working great. Its specifically built on the pgx driver directly and doesn't use the stdlib database/sql adapter. https://github.com/goware/pgkit/pull/6/files#r675962030 is a PR and link to specific issue. You'll notice the the PR I've begun to implement the pgx custom type methods, but these are just stubbed in / disabled for now. Ideally I could just use the Scanner/Valuer methods.

The error I receive is "scany: scan row into struct fields: can't scan into dest[3]: unable to assign to *dbtype.BigInt" -- https://github.com/goware/pgkit/pull/6/files#r675962214

From the comments above, it seems if I change pgkit to move to using database/sql adapter, then my code should work? Since this is such a common use for me of having a nullable struct custom type, I may have no choice but to do this. So I was wondering if there is another solution, or what are the trade-offs from using pgx vs database/sql wrapper in terms of performance / other? Or is the wrapper just a change of api, but the performance benefits are the same since pgx is used under the hood? thank you


finally if anyone is interested to run the pgkit test, just clone the repo, switch to to bigint-scan-issue branch, and run:

  1. make db-create -- this will make the pgkit_test db
  2. make test -- this will run all tests