tinylib / msgp

A Go code generator for MessagePack / msgpack.org[Go]
MIT License
1.79k stars 190 forks source link

Support for go generics #327

Open mredolatti opened 1 year ago

mredolatti commented 1 year ago

Hi! are there any plans to support generic structs? Something like this:

type Person struct {
    Name string
}

type Response[T any] struct {
    Status int
    Payload T
}

type PersonResponse Response[Person]

will try to generate code as if Response was indeed a struct and instead of a template.

image

Thanks!

gudron commented 1 year ago

codegen + generics ... srsly? @mredolatti

connorszczepaniak-wk commented 2 months ago

This would be a great change to have. I think having codegen for generic things makes a lot of sense when wanting to serialize structs wrapping arbitrarily-typed values as described in the issue.

Perhaps it would be a requirement that the type parameter(s) are constrained by msgp.Marshaller and msgp.Unmarshaller.

matthinrichsen-wf commented 1 month ago

I took a very rough stab at supporting generics in this library https://github.com/tinylib/msgp/compare/master...matthinrichsen-wf:msgp:generics?expand=1

klauspost commented 1 month ago

It is quite hard to judge without tests. I don't think we can go too wrong with it - since it just doesn't work now - but having a fairly solid "v1" release of it would of course be preferable.

klauspost commented 1 month ago

AFAICT the most difficult part is to support primitive types.

Looking at the most basic example:

type Foo[T any] struct {
    X   string
    Bar T
}

If we generate the methods with proper Foo[T], we still have the problem that we will have to provide functionality for any type of T. AFAICT this means converting to an interface.

MarshalMsg

Trivial. o, err = msgp.AppendIntf(o, z.Bar) should encode it.

EncodeMsg

Trivial. err = en.WriteIntf(z.Bar) should write it.

UnmarshalMsg

Now the fun begins:

        case "Bar":
            bts, err = z.Bar.UnmarshalMsg(bts)

This will not work, since z.Bar doesn't guarantee the compiler that UnmarshalMsg is provided. AFAICT we do not have a function that provides this. As far as I can tell this should do it:

// ReadIntoIntfBytes will read a value and store it in i.
// The destination i must be a pointer, so the value is writable.
func ReadIntoIntfBytes(i any, b []byte) (o []byte, err error) {
    switch i := i.(type) {
    case Unmarshaler:
        return i.UnmarshalMsg(b)
    case Extension:
        return ReadExtensionBytes(b, i)
    case *bool:
        *i, o, err = ReadBoolBytes(b)
        return
    case **bool:
        if IsNil(b) {
            *i = nil
            return ReadNilBytes(b)
        }
        var dst bool
        if dst, o, err = ReadBoolBytes(b); err != nil {
            return nil, err
        }
        *i = &dst
    case *float32:
        *i, o, err = ReadFloat32Bytes(b)
    //..
    }

    // DO reflection similar to AppendIntf.

    return b, fmt.Errorf("cannot unmarshal into %T", i)
}

There is probably also a need for reflect so *Unmarshaler would be handled properly.

The call would be bts, err = msgp.ReadIntoIntfBytes(&z.Bar, bts).

DecodeMsg

Should probably be similar to above.

MsgSize

A function would need to be written that returns size of the interface type.

Array []T and map types map[string]T also seems "interesting", and will probably need some special care.

philhofer commented 1 month ago

The behavior for the current codegen when a type is opaque to the tool (i.e. defined in a different source file) is to assume it implements Marshaler, Unmarshaler, etc. I suspect it would be simplest just to assume the same thing for T any. I suppose could even require any type T for which we are generating code to explicitly satisfy the Marshaler / Unmarshaler / etc. constraints.

On Wed, Aug 14, 2024 at 2:04 AM Klaus Post @.***> wrote:

AFAICT the most difficult part is to support primitive types.

Looking at the most basic example:

type Foo[T any] struct { X string Bar T }

If we generate the methods with proper Foo[T], we still have the problem that we will have to provide functionality for any type of T. AFAICT this means converting to an interface. MarshalMsg

Trivial. o, err = msgp.AppendIntf(o, z.Bar) should encode it. EncodeMsg

Trivial. err = en.WriteIntf(z.Bar) should write it. UnmarshalMsg

Now the fun begins:

  case "Bar":
      bts, err = z.Bar.UnmarshalMsg(bts)

This will not work, since z.Bar doesn't guarantee the compiler that UnmarshalMsg is provided. AFAICT we do not have a function that provides this. As far as I can tell this should do it:

// ReadIntoIntfBytes will read a value and store it in i.// The destination i must be a pointer, so the value is writable.func ReadIntoIntfBytes(i any, b []byte) (o []byte, err error) { switch i := i.(type) { case Unmarshaler: return i.UnmarshalMsg(b) case Extension: return ReadExtensionBytes(b, i) case bool: i, o, err = ReadBoolBytes(b) return case *bool: if IsNil(b) { i = nil return ReadNilBytes(b) } var dst bool if dst, o, err = ReadBoolBytes(b); err != nil { return nil, err } i = &dst case float32: *i, o, err = ReadFloat32Bytes(b) //.. } return b, fmt.Errorf("cannot unmarshal into %T", i) }

There is probably also a need for reflect so *Unmarshaler would be handled properly.

The call would be bts, err = msgp.ReadIntoIntfBytes(&z.Bar, bts). DecodeMsg

Should probably be similar to above. MsgSize

A function would need to be written that returns size of the interface type.

Array []T and map types map[string]T also seems "interesting", and will probably need some special care.

— Reply to this email directly, view it on GitHub https://github.com/tinylib/msgp/issues/327#issuecomment-2288236173, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAWN7ZQV5YNAYAHHGP4IRL3ZRMMS7AVCNFSM6AAAAABLA4IZ4OVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDEOBYGIZTMMJXGM . You are receiving this because you are subscribed to this thread.Message ID: @.***>

klauspost commented 1 month ago

@philhofer The issue is that T any doesn't imply to the compiler that it supports Marshaler, so to do valid code you would no longer be able to make it generic, but you'd need to explicitly provide all the interfaces as the type for T.

But yeah, that could be a reasonable first step, but it would be rather limited.

klauspost commented 1 month ago

I checked the asm output from this, and it doesn't seem like the compiler can deduce the type and the any conversion cannot be omitted.

Branches will however be fully predictable, so it will not be the end of the world in terms of performance - but somewhat suboptimal.

In terms of speed, the "foreign type" assumption seems fine.

However, it does place big restrictions on the usage. Let's say we want a generic with a string type (but it could be any primitive).. We try:

type Foo[T Suite] struct {
    Bar T
}

type Suite interface {
    msgp.Marshaler
    msgp.Encodable
    msgp.Unmarshaler
    msgp.Decodable
    msgp.Sizer
}

type MyString string

func test() {
    x := Foo[MyString]{Bar: "abc"}
    fmt.Println(x)
}

We generate the output from this. Except for the missing [T] on the Foo functions it actually looks good.

However the issue then becomes that the non-pointer MyString doesn't support the decode functionality. So this would only work if T is a pointer. That seems like a very big limitation, and it fundamentally changes how I must write my Foo struct, and the encoded value can now be nil.

I hope this makes sense to why I was looking at an approach that involved annoying any conversions and type switches.

I may very well be missing something, but it seems like treating generic types as foreign types will significantly reduce the usability - at least for me.