jackc / pgx

PostgreSQL driver and toolkit for Go
MIT License
10.72k stars 837 forks source link

pgtype: driver.Valuer support for Array or Map #1662

Open muhlemmer opened 1 year ago

muhlemmer commented 1 year ago

Is your feature request related to a problem? Please describe.

We use stdlib pgx in our project. For unit tests we use sqlmock for unit tests. We want to migrate to v5 and the related v5/pgtype sub-package from the stand-alone pgtype package.

sqlmock doesn't like slice types such as []int, []string etc. As a solution we have some wrapper types like:

type StringArray []string

// Scan implements the [database/sql.Scanner] interface.
func (s *StringArray) Scan(src any) error {
    array := new(pgtype.TextArray)
    if err := array.Scan(src); err != nil {
        return err
    }
    if err := array.AssignTo(s); err != nil {
        return err
    }
    return nil
}

// Value implements the [database/sql/driver.Valuer] interface.
func (s StringArray) Value() (driver.Value, error) {
    if len(s) == 0 {
        return nil, nil
    }

    array := pgtype.TextArray{}
    if err := array.Set(s); err != nil {
        return nil, err
    }

    return array.Value()
}

I'm aware that pgx/stdlib support direct passing of a slice as parameter value, but sqlmock does not. Hence we need to implement the driver.Valuer interface like above. The new pgtype.Array does not implement driver.Valuer anymore.

I've read the conversation in https://github.com/jackc/pgx/issues/1458. It pointed out, the Map type can be used for scanning into arrays, which is fine for that case.

Describe the solution you'd like It would be awesome if one of the following were to be implemented:

  1. pgtype.Array gets a Value() method.
  2. pgtype.Map gets a SQLValuer() or similar method.

Describe alternatives you've considered

I'm aware this is not a pgx issue necessarily, as we are the ones swapping to another (mock) driver. sqlmock provides the possibility to use a driver.ValueConverter instead. As some discussions over there suggest, the driver might already export a ValueConverter which then can be used. However, I was not able to find anything in the pgx project.

  1. If there is a ValueConverter somewhere and I missed it, I will gladly use it. 2 Implement my own ValueConverter as a last resort.

Additional context none

muhlemmer commented 1 year ago

As a workaround this works:

type StringArray []string

// Scan implements the [database/sql.Scanner] interface.
func (s *StringArray) Scan(src any) error {
    var dst []string
    if err := pgtype_v5.NewMap().SQLScanner(&dst).Scan(src); err != nil {
        return err
    }
    *s = dst
    return nil
}

// Value implements the [database/sql/driver.Valuer] interface.
func (s StringArray) Value() (driver.Value, error) {
    if len(s) == 0 {
        return nil, nil
    }
    src := pgtype_v5.FlatArray[string](s)
    codec := pgtype_v5.ArrayCodec{
        ElementType: &pgtype_v5.Type{
            Name:  "text",
            OID:   pgtype_v5.TextOID,
            Codec: pgtype_v5.TextCodec{},
        },
    }
    buf, err := codec.PlanEncode(pgtype_v5.NewMap(), pgtype_v5.TextArrayOID, pgtype_v5.TextFormatCode, src).Encode(src, nil)
    if err != nil {
        return nil, err
    }
    return string(buf), err
}

But it's a bit verbose.

muhlemmer commented 1 year ago

The following generic version seems to meet most of my demands. Perhaps something similar can be done to the pgtype package? I'm willing to send a PR.

var pgTypeMap = pgtype_v5.NewMap()

type Array[T any] []T

func (a *Array[T]) Scan(src any) error {
    var dst []T
    if err := pgTypeMap.SQLScanner(&dst).Scan(src); err != nil {
        return err
    }
    *a = Array[T](dst)
    return nil
}

func (a Array[T]) Value() (driver.Value, error) {
    if len(a) == 0 {
        return nil, nil
    }
    src := pgtype_v5.FlatArray[T](a)
    arrayType, ok1 := pgTypeMap.TypeForValue(src)
    elementType, ok2 := pgTypeMap.TypeForValue(src[0])
    if !ok1 || !ok2 {
        return nil, fmt.Errorf("can't encode %T", src)
    }
    codec := pgtype_v5.ArrayCodec{ElementType: elementType}

    buf, err := codec.PlanEncode(pgTypeMap, arrayType.OID, pgtype_v5.TextFormatCode, src).Encode(src, nil)
    if err != nil {
        return nil, err
    }
    return string(buf), err
}
jackc commented 1 year ago

pgtype.Array gets a Value() method.

I don't think this is feasible. The generic approach for Array[T] would only work for builtin types unless custom types were also registered on pgTypeMap. This would basically mean a singleton for types. While it is rare to connect to multiple databases in the same program, it is possible, and this approach would not work in that case (not to mention potential concurrency issues).

pgtype.Map gets a SQLValuer() or similar method.

I haven't investigated to see if this would be practical, but at first glance I think it is a good idea. And I like the symmetry with Map.SQLScanner.


I think it would also be possible to use Array[T] as a building block to build non-generic array types like v4 had without too much ceremony. It might be worth looking into that as well.

muhlemmer commented 1 year ago

not to mention potential concurrency issues

Yes, the race detector already slapped me on the wrist for that.

I think it would also be possible to use Array[T] as a building block to build non-generic array types like v4 had without too much ceremony. It might be worth looking into that as well.

I read in some issue that v4 used a generator. What tool did you use for generation? It might be worth to port it for v5 array types, building on Array[T].

jackc commented 1 year ago

I read in some issue that v4 used a generator. What tool did you use for generation? It might be worth to port it for v5 array types, building on Array[T].

I'm a long time Ruby developer so I was using erb. May or may not be best for the project as a whole, but it was what was most convenient for me at the time.

A few thoughts on this though. 1. If we add concrete types I think it would only make sense in the stdlib package, not the pgtype package as it is only necessary when using stdlib. 2. Depending on how many types and how much boilerplate there is it might be worth manually duplicating the logic instead of adding a build step for templating.

nissim-natanov-outreach commented 1 year ago

This is a great demo code, thanks for the tip.

var pgTypeMap = pgtype_v5.NewMap()

Please note that the pgtype.Map instance is not safe for concurrent use and it will almost immediately fire a data race violation if used concurrently since the Map uses memoizedScanPlans plans like one here: https://github.com/jackc/pgx/blob/d626dfe94e250e911b77ac929e2be06f81042bd0/pgtype/pgtype.go#L223

To make the map concurrency safe, one may need to either add RWLock around these plans or wrap the whole Map instance with a plain Lock for all access which would make things slow.

muhlemmer commented 1 year ago

Please note that the pgtype.Map instance is not safe for concurrent use and it will almost immediately fire a data race violation if used concurrently since the Map uses memoizedScanPlans plans like one

Yes, I noticed also but I didn't update above snippet as I want to implement a better more permanent solution that can be merged with upstream.

muhlemmer commented 1 year ago

A few thoughts on this though. 1. If we add concrete types I think it would only make sense in the stdlib package, not the pgtype package as it is only necessary when using stdlib. 2. Depending on how many types and how much boilerplate there is it might be worth manually duplicating the logic instead of adding a build step for templating.

@jackc I wouldn't mind sending PRs against the stlib package. I would probably first commit to a TextArray first, as this is the most needed for us and later see for other types, or others might want to chip in where they need array support.

jackc commented 1 year ago

👍

It be interesting to experiment with a few different approaches before committing to any one. I still don't see any alternative to an array type for each underlying type, but it bothers me that we can't find a generic approach.

... actually, I wonder if this would be feasible?

type Element interface {
    sql.Scanner
    driver.Valuer
}

type Array[T Element] []T

I think that should be implementable for any Element without needing a pgtype.Map.

muhlemmer commented 1 year ago

Guys, can you please stop from hijacking this thread? The topic is to discuss a new feature request and any code I posted is just for understanding the concept and for the sake of discussion. It is not provided so that people can use it and then start asking questions why it doesn't work as they thought it would.

Thanks.

zakcutner commented 6 months ago

Hello! I was wondering if there's been any more progress on this? I think that even an SQLValuer() method on pgtype.Map, as discussed above, would be very useful to make this simpler.

jackc commented 5 months ago

Candidate solution: https://github.com/jackc/pgx/pull/2020