jackc / pgtype

MIT License
300 stars 111 forks source link

Register a custom type with sqlx #209

Open cuprumtan opened 9 months ago

cuprumtan commented 9 months ago

I have a question about custom types registration. Can you please help me?

I use PostgreSQL with 64-bit XID patch. So this means that the regular pgtype.XID is no longer suitable because it is based on pguint32. I wrote custom pguint64 and XID64 types:

pguint64.go ```go package types import ( "database/sql/driver" "encoding/binary" "errors" "fmt" "strconv" "github.com/jackc/pgio" "github.com/jackc/pgtype" ) // pguint64 is the core type that is used to implement PostgrePro type such as XID. type pguint64 struct { Uint uint64 Status pgtype.Status } // Set converts from src to dst. Note that as pguint64 is not a general // number type Set does not do automatic type conversion as other number // types do. func (dst *pguint64) Set(src interface{}) error { switch value := src.(type) { case int64: if value < 0 { return fmt.Errorf("%d is less than minimum value for pguint64", value) } *dst = pguint64{Uint: uint64(value), Status: pgtype.Present} case uint64: *dst = pguint64{Uint: value, Status: pgtype.Present} default: return fmt.Errorf("cannot convert %v to pguint64", value) } return nil } func (dst pguint64) Get() interface{} { switch dst.Status { case pgtype.Present: return dst.Uint case pgtype.Null: return nil default: return dst.Status } } // AssignTo assigns from src to dst. Note that as pguint64 is not a general number // type AssignTo does not do automatic type conversion as other number types do. func (src *pguint64) AssignTo(dst interface{}) error { switch v := dst.(type) { case *uint64: if src.Status == pgtype.Present { *v = src.Uint } else { return fmt.Errorf("cannot assign %v into %T", src, dst) } case **uint64: if src.Status == pgtype.Present { n := src.Uint *v = &n } else { *v = nil } } return nil } func (dst *pguint64) DecodeText(ci *pgtype.ConnInfo, src []byte) error { if src == nil { *dst = pguint64{Status: pgtype.Null} return nil } n, err := strconv.ParseUint(string(src), 10, 64) if err != nil { return err } *dst = pguint64{Uint: n, Status: pgtype.Present} return nil } func (dst *pguint64) DecodeBinary(ci *pgtype.ConnInfo, src []byte) error { if src == nil { *dst = pguint64{Status: pgtype.Null} return nil } if len(src) != 8 { return fmt.Errorf("invalid length: %v", len(src)) } n := binary.BigEndian.Uint64(src) *dst = pguint64{Uint: n, Status: pgtype.Present} return nil } func (src pguint64) EncodeText(ci *pgtype.ConnInfo, buf []byte) ([]byte, error) { switch src.Status { case pgtype.Null: return nil, nil case pgtype.Undefined: return nil, errors.New("cannot encode status undefined") } return append(buf, strconv.FormatUint(src.Uint, 10)...), nil } func (src pguint64) EncodeBinary(ci *pgtype.ConnInfo, buf []byte) ([]byte, error) { switch src.Status { case pgtype.Null: return nil, nil case pgtype.Undefined: return nil, errors.New("cannot encode status undefined") } return pgio.AppendUint64(buf, src.Uint), nil } // Scan implements the database/sql Scanner interface. func (dst *pguint64) Scan(src interface{}) error { if src == nil { *dst = pguint64{Status: pgtype.Null} return nil } switch src := src.(type) { case uint64: *dst = pguint64{Uint: src, Status: pgtype.Present} return nil case int64: *dst = pguint64{Uint: uint64(src), Status: pgtype.Present} return nil case string: return dst.DecodeText(nil, []byte(src)) case []byte: srcCopy := make([]byte, len(src)) copy(srcCopy, src) return dst.DecodeText(nil, srcCopy) } return fmt.Errorf("cannot scan %T", src) } // Value implements the database/sql/driver Valuer interface. func (src pguint64) Value() (driver.Value, error) { switch src.Status { case pgtype.Present: return src.Uint, nil case pgtype.Null: return nil, nil default: return nil, errors.New("cannot encode status undefined") } } ```
xid64.go ```go package types import ( "database/sql/driver" "github.com/jackc/pgtype" ) // XID is PostgresPro's Transaction ID type. // // In later versions of PostgresPro, it is the type used for the backend_xid // and backend_xmin columns of the pg_stat_activity system view. // // Also, when one does // // select xmin, xmax, * from some_table; // // it is the data type of the xmin and xmax hidden system columns. // // It is currently implemented as an unsigned eight byte integer. type XID64 pguint64 // Set converts from src to dst. Note that as XID64 is not a general // number type Set does not do automatic type conversion as other number // types do. func (dst *XID64) Set(src interface{}) error { return (*pguint64)(dst).Set(src) } func (dst XID64) Get() interface{} { return (pguint64)(dst).Get() } // AssignTo assigns from src to dst. Note that as XID64 is not a general number // type AssignTo does not do automatic type conversion as other number types do. func (src *XID64) AssignTo(dst interface{}) error { return (*pguint64)(src).AssignTo(dst) } func (dst *XID64) DecodeText(ci *pgtype.ConnInfo, src []byte) error { return (*pguint64)(dst).DecodeText(ci, src) } func (dst *XID64) DecodeBinary(ci *pgtype.ConnInfo, src []byte) error { return (*pguint64)(dst).DecodeBinary(ci, src) } func (src XID64) EncodeText(ci *pgtype.ConnInfo, buf []byte) ([]byte, error) { return (pguint64)(src).EncodeText(ci, buf) } func (src XID64) EncodeBinary(ci *pgtype.ConnInfo, buf []byte) ([]byte, error) { return (pguint64)(src).EncodeBinary(ci, buf) } // Scan implements the database/sql Scanner interface. func (dst *XID64) Scan(src interface{}) error { return (*pguint64)(dst).Scan(src) } // Value implements the database/sql/driver Valuer interface. func (src XID64) Value() (driver.Value, error) { return (pguint64)(src).Value() } ```

I can easily register this new XID64 type with pgx and replace the regular one:

db, err := pgx.Connect(context.Background(), str)
if err != nil {
    return nil, fmt.Errorf("connect to database failed: %w", err)
}

db.ConnInfo().RegisterDataType(pgtype.DataType{Value: &types.XID64{}, Name: "xid", OID: pgtype.XIDOID})

But for me it is crucial to use sqlx instead of pgx. Is there any possibility to register type with sqlx?

To avoid misunderstandings, I will indicate the key point of importance of sqlx for me: I need to scan an unspecified number of fields into struct in different cases, so I need sqlx methods which contain destination as interface{}. pgx does not have such functionality.
pgx v5 can be used with scany which has methods for scanning into struct. But with pgtype v5 I cannot scan nullable fields with complex types like inet, xid, etc. and cannot differentiate that a field is NULL or just undefined (like Status in pgtype v4). Or am I wrong about this?

jackc commented 9 months ago

You can use OptionAfterConnect() to get inject custom code into the new connection path with database/sql / stdlib. You could register your new data type there.

cuprumtan commented 9 months ago

I tried to register my type using stdlib:

// Stats provides access to Postgres server statistics through SQL interface.
type Stats struct {
    db      *sqlx.DB
}

// NewFromConnectionString establishes connection to Postgres server using DSN or URI string
// and creates a new Stats object.
//
//nolint:wrapcheck
func NewFromConnectionString(str string) (*Stats, error) {
    connConfig, err := pgx.ParseConfig(str)
    if err != nil {
        return nil, err
    }

    pgxdb := stdlib.OpenDB(*connConfig,
        stdlib.OptionAfterConnect(func(ctx context.Context, conn *pgx.Conn) error {
            conn.ConnInfo().RegisterDataType(pgtype.DataType{Value: &types.XID64{}, Name: "xid", OID: pgtype.XIDOID})

            return nil
        }))

    db := sqlx.NewDb(pgxdb, "pgx")

    s := &Stats{
        db: db,
    }

    return s, nil
}

But when I scan the data, pgtype.XID is still called even though I have explicitly set custom types.XID64. I'm scanning data using db.SelectContext().

jackc commented 9 months ago

I don't know. As far as I can tell that should work. It might be necessary to step through the debugger to find out what's going wrong.

nurlantulemisov commented 9 months ago

Thanks for answer. I have the same problem where OptionAfterConnect only works with custom types, but I want to replace XID with 64xid instead of the current 32xid. I have opened a pull request, but I am unsure if I have done it correctly.

https://github.com/jackc/pgx/pull/1749

jackc commented 9 months ago

So I just had a chance to look closer at this. Do I understand correctly that this 64xid patch actually changes the data type and binary format of a fixed / known OID? 😱

That's ... interesting... And this might end up in core PostgreSQL? 😨 That would seem to make it difficult for a driver or application to reliably handle the xid type when it could be int32 or int64.

Well, I suppose xid is not a commonly used type...


And as you discovered with pgx, the stdlib package essentially hard-codes some builtin types for performance (to be able to use the binary format). But it's hard to imagine that it matters for xid. The simplest solution might actually be to remove all references to xid from the stdlib package.

cuprumtan commented 9 months ago

@jackc yes, this patch changes the data type and binary format of XID. But not for OID. This was done in order to increase the transaction counter to avoid its frequent overflow.

jackc commented 9 months ago

@jackc yes, this patch changes the data type and binary format of XID. But not for OID. This was done in order to increase the transaction counter to avoid its frequent overflow.

Right. What I mean is OIDs are used to identify data types. And OID 28 meaning uint32 in some versions of PostgreSQL and uint64 in others seems very unfortunate.

cuprumtan commented 9 months ago

@jackc yes, this patch changes the data type and binary format of XID. But not for OID. This was done in order to increase the transaction counter to avoid its frequent overflow.

Right. What I mean is OIDs are used to identify data types. And OID 28 meaning uint32 in some versions of PostgreSQL and uint64 in others seems very unfortunate.

Got it! Thanks for the explanation.

nurlantulemisov commented 9 months ago

And this might end up in core PostgreSQL?

I suppose it will be added to the new major version.

That would seem to make it difficult for a driver or application to reliably handle the xid type when it could be int32 or int64.

If this patch is included in the core, we will use only uint64 for xid.


The simplest solution might actually be to remove all references to xid from the stdlib package.

I'm considering using 64xid for my application, but since we're still waiting for the patch to be included in the core, what do you think about removing all references to xid from stdlib?

jackc commented 9 months ago

If this patch is included in the core, we will use only uint64 for xid.

Any given server will only have one type, but pgx would be expected to connect to old and new servers and would have to somehow know what to expect.

what do you think about removing all references to xid from stdlib?

I think that's fair. It would slightly reduce performance, but I can't imagine xid parsing is a significant factor.

cuprumtan commented 9 months ago

but I can't imagine xid parsing is a significant factor.

XID is widely used in monitoring. For example, to collect wraparound metrics based on pg_stat_activity.backend_xid/pg_stat_activity.backend_xmin, pg_database.datfrozenxid. Also for tracking emergency start of autovacuum.

cuprumtan commented 9 months ago

Anyway, I got the answer to my question. I gave examples why it is important for us to work with the xid type. But it’s too early to talk about 64-bit xid until the patch is officially included in PostgreSQL.

To solve my problem I just use pgtype.Text for xid fields and do the conversion to uint64 on the application side.

@jackc @nurlantulemisov thank you!

nurlantulemisov commented 9 months ago

@jackc Why don't you want to replace pgtype to any custom types also in binary format ?

How I can see here we do it.

jackc commented 9 months ago

@nurlantulemisov It might be possible, I'm not sure, but there are definitely a few complications.

A custom type doesn't have to support the binary format. The query sending code would have to be changed to look up if the registered type supported the binary format (and maybe some other checks - database/sql only allows certain types)

It could be slightly slower to remove the hard-coded path.

And this has never come up before in the ~10 year life of pgx. The types that are hard-coded for binary are core types like int. Nobody has ever wanted to change how they are decoded before. I am reluctant to make major changes for a patch to PostgreSQL that may never even be merged.

nurlantulemisov commented 9 months ago

@jackc I got it. It makes sense. Thanks for answer.