Open mredolatti opened 1 year ago
codegen + generics ... srsly? @mredolatti
This would be a great change to have. I think having codegen for generic things makes a lot of sense when wanting to serialize structs wrapping arbitrarily-typed values as described in the issue.
Perhaps it would be a requirement that the type parameter(s) are constrained by msgp.Marshaller
and msgp.Unmarshaller
.
I took a very rough stab at supporting generics in this library https://github.com/tinylib/msgp/compare/master...matthinrichsen-wf:msgp:generics?expand=1
It is quite hard to judge without tests. I don't think we can go too wrong with it - since it just doesn't work now - but having a fairly solid "v1" release of it would of course be preferable.
AFAICT the most difficult part is to support primitive types.
Looking at the most basic example:
type Foo[T any] struct {
X string
Bar T
}
If we generate the methods with proper Foo[T]
, we still have the problem that we will have to provide functionality for any type of T. AFAICT this means converting to an interface.
Trivial. o, err = msgp.AppendIntf(o, z.Bar)
should encode it.
Trivial. err = en.WriteIntf(z.Bar)
should write it.
Now the fun begins:
case "Bar":
bts, err = z.Bar.UnmarshalMsg(bts)
This will not work, since z.Bar
doesn't guarantee the compiler that UnmarshalMsg
is provided. AFAICT we do not have a function that provides this. As far as I can tell this should do it:
// ReadIntoIntfBytes will read a value and store it in i.
// The destination i must be a pointer, so the value is writable.
func ReadIntoIntfBytes(i any, b []byte) (o []byte, err error) {
switch i := i.(type) {
case Unmarshaler:
return i.UnmarshalMsg(b)
case Extension:
return ReadExtensionBytes(b, i)
case *bool:
*i, o, err = ReadBoolBytes(b)
return
case **bool:
if IsNil(b) {
*i = nil
return ReadNilBytes(b)
}
var dst bool
if dst, o, err = ReadBoolBytes(b); err != nil {
return nil, err
}
*i = &dst
case *float32:
*i, o, err = ReadFloat32Bytes(b)
//..
}
// DO reflection similar to AppendIntf.
return b, fmt.Errorf("cannot unmarshal into %T", i)
}
There is probably also a need for reflect
so *Unmarshaler
would be handled properly.
The call would be bts, err = msgp.ReadIntoIntfBytes(&z.Bar, bts)
.
Should probably be similar to above.
A function would need to be written that returns size of the interface type.
Array []T
and map types map[string]T
also seems "interesting", and will probably need some special care.
The behavior for the current codegen when a type is opaque to the tool
(i.e. defined in a different source file) is to assume it implements
Marshaler
, Unmarshaler
, etc. I suspect it would be simplest just to
assume the same thing for T any
. I suppose could even require any type
T
for which we are generating code to explicitly satisfy the Marshaler
/ Unmarshaler
/ etc. constraints.
On Wed, Aug 14, 2024 at 2:04 AM Klaus Post @.***> wrote:
AFAICT the most difficult part is to support primitive types.
Looking at the most basic example:
type Foo[T any] struct { X string Bar T }
If we generate the methods with proper Foo[T], we still have the problem that we will have to provide functionality for any type of T. AFAICT this means converting to an interface. MarshalMsg
Trivial. o, err = msgp.AppendIntf(o, z.Bar) should encode it. EncodeMsg
Trivial. err = en.WriteIntf(z.Bar) should write it. UnmarshalMsg
Now the fun begins:
case "Bar": bts, err = z.Bar.UnmarshalMsg(bts)
This will not work, since z.Bar doesn't guarantee the compiler that UnmarshalMsg is provided. AFAICT we do not have a function that provides this. As far as I can tell this should do it:
// ReadIntoIntfBytes will read a value and store it in i.// The destination i must be a pointer, so the value is writable.func ReadIntoIntfBytes(i any, b []byte) (o []byte, err error) { switch i := i.(type) { case Unmarshaler: return i.UnmarshalMsg(b) case Extension: return ReadExtensionBytes(b, i) case bool: i, o, err = ReadBoolBytes(b) return case *bool: if IsNil(b) { i = nil return ReadNilBytes(b) } var dst bool if dst, o, err = ReadBoolBytes(b); err != nil { return nil, err } i = &dst case float32: *i, o, err = ReadFloat32Bytes(b) //.. } return b, fmt.Errorf("cannot unmarshal into %T", i) }
There is probably also a need for reflect so *Unmarshaler would be handled properly.
The call would be bts, err = msgp.ReadIntoIntfBytes(&z.Bar, bts). DecodeMsg
Should probably be similar to above. MsgSize
A function would need to be written that returns size of the interface type.
Array []T and map types map[string]T also seems "interesting", and will probably need some special care.
— Reply to this email directly, view it on GitHub https://github.com/tinylib/msgp/issues/327#issuecomment-2288236173, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAWN7ZQV5YNAYAHHGP4IRL3ZRMMS7AVCNFSM6AAAAABLA4IZ4OVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDEOBYGIZTMMJXGM . You are receiving this because you are subscribed to this thread.Message ID: @.***>
@philhofer The issue is that T any
doesn't imply to the compiler that it supports Marshaler
, so to do valid code you would no longer be able to make it generic, but you'd need to explicitly provide all the interfaces as the type for T.
But yeah, that could be a reasonable first step, but it would be rather limited.
I checked the asm output from this, and it doesn't seem like the compiler can deduce the type and the any
conversion cannot be omitted.
Branches will however be fully predictable, so it will not be the end of the world in terms of performance - but somewhat suboptimal.
In terms of speed, the "foreign type" assumption seems fine.
However, it does place big restrictions on the usage. Let's say we want a generic with a string type (but it could be any primitive).. We try:
type Foo[T Suite] struct {
Bar T
}
type Suite interface {
msgp.Marshaler
msgp.Encodable
msgp.Unmarshaler
msgp.Decodable
msgp.Sizer
}
type MyString string
func test() {
x := Foo[MyString]{Bar: "abc"}
fmt.Println(x)
}
We generate the output from this. Except for the missing [T]
on the Foo functions it actually looks good.
However the issue then becomes that the non-pointer MyString
doesn't support the decode functionality. So this would only work if T is a pointer. That seems like a very big limitation, and it fundamentally changes how I must write my Foo struct, and the encoded value can now be nil.
I hope this makes sense to why I was looking at an approach that involved annoying any
conversions and type switches.
I may very well be missing something, but it seems like treating generic types as foreign types will significantly reduce the usability - at least for me.
Hi! are there any plans to support generic structs? Something like this:
will try to generate code as if
Response
was indeed a struct and instead of a template.Thanks!