ugorji / go

idiomatic codec and rpc lib for msgpack, cbor, json, etc. msgpack.org[Go]
MIT License
1.85k stars 295 forks source link

register extension without custom en/decoding #274

Closed geoah closed 5 years ago

geoah commented 5 years ago

As far as I've managed to dig into the docs, registering an extension (cbor tag) requires you to also manually en/decode the data yourself. Both the InterfaceExt and Ext interfaces assume you implement the en/decoding.

I was wondering if there is a way to register a struct as an extension, and let the normal un/marshalling of the struct to take place for its payload.

eg.

type Something struct {
    Foo string `json:"foo"`
}

I'd like for example to be able to register Foo as extension 123 and have the handler marshal the tag, followed by a map with a string element with key foo.

Is there a way to do something simple like this in the current state of the lib that I am missing? If not, could point me to the right direction of what could be done to add it?

Thank you for an amazing library.

ugorji commented 5 years ago

Nah - this is not possible. The library works for multiple encodings, where this idea may not make sense for e.g. msgpack has a different extension model from cbor (which is why we have 2 distinct pairs of extension functions based on the type of extension).

However, you can easily make a private helper function to do this for you, specifically for cbor.

kostko commented 5 years ago

I would also need this, @ugorji can you expand a bit on "you can easily make a private helper function to do this for you"? I see how this could be done by changing the way extension methods are called:

ugorji commented 5 years ago

@kostko I will follow up in a bit.

@geoah I owe you an apology. I just read my comment above, and am totally unsure what I was thinking when I wrote that. I don't think I thought about your question long enough.

Responses coming soon ...

ugorji commented 5 years ago

@geoah @kostko

There isn't an easy way to do this today.

Ergonomics of this package is also very big for me, so I'm always wary of exposing the surface area where it becomes harder to use.

I think I have an idea on how to support this now.

I will provide a global var: SelfExt, which is a BytesExt and InterfaceExt.

You will need to only register types which implement a function: CodecSelfExt() interface{}

The easiest way is the following:

type T struct {
    tHelper
}
func (t *T) CodecSelfExt() { return &t.tHelper }
type tHelper struct {
    // ... all t fields
}

You can then just register them as normal:

cborHandle.SetInterfaceExt(reflect.TypeOf(t), 127, codec.SelfExt) 

With this, I will implement SelfExt and just expose it as a global variable, and can internally implement the mechanics to make this work for both BytesExt and InterfaceExt.

kostko commented 5 years ago

This sounds really good!

So my current hacky way is changing the InterfaceExt to contain a Default() bool which if it is true triggers different processing. I changed fn, encodeValue and decodeValue to take a checkExt boolean which makes it skip extension decoding for a single struct (in decoder.kInterfaceNaked, cbor.DecodeExt and cbor.EncodeExt).

ugorji commented 5 years ago

I would also need this, @ugorji can you expand a bit on "you can easily make a private helper function to do this for you"? I see how this could be done by changing the way extension methods are called:

I truly do not remember what I was trying to suggest there.

  • CBOR's DecodeExt would decode into rv directly instead of the empty interface.
  • CBOR's EncodeExt would need to signal to encode that it should not try to call the extension again to avoid infinite recursion.

We cannot do this because extensions are registered by type. Consequently, trying to decode back into yourself will lead to an infinite loop, as we will dynamically ask for the "code" that handles encoding/decoding that type, and it will get back to us.

There is no way to signal that up, since we are just registering on type itself. To signal, we would have to keep track of objects also (like what SetFinalizer does) and that is too hairy to even think through.

The way to do it is with a helper unregistered object, like in the comment above https://github.com/ugorji/go/issues/274#issuecomment-496918456

ugorji commented 5 years ago

I had to try it out quickly.

@kostko

Please test out the fix and let me know. The usage is simple:

https://github.com/ugorji/go/blob/449ecf999a86727ce57d569af80bb95b1cc71b28/codec/values_flex_test.go#L138-L151 https://github.com/ugorji/go/blob/449ecf999a86727ce57d569af80bb95b1cc71b28/codec/codec_test.go#L135 https://github.com/ugorji/go/blob/449ecf999a86727ce57d569af80bb95b1cc71b28/codec/codec_test.go#L427-L433 https://github.com/ugorji/go/blob/449ecf999a86727ce57d569af80bb95b1cc71b28/codec/codec_test.go#L2483-L2498

Tests included: run using go test -run SelfExt

Hope this helps.

@geoah I should have looked deeper into your request. I looked too quickly then. That's my bad. Please, next time, just ping me again if you don't agree with what I said, so I can take a second look.

kostko commented 5 years ago

Thanks for your quick reply and a proposed fix!

So I checked it out and while this works it seems a bit awkward to me to require the structs to be changed to embed a serialization helper struct to be able to use this.

My approach is slightly different and doesn't require any modification to structs, you can check it out here: https://github.com/kostko/go-codec/commit/07ccbbbe8f7fbb6456c9a6060574ebb46bd6240b

This currently breaks backward compatibility as it adds the UseDefault method to InterfaceExt which is not ideal, but this part can easily be refactored.

ugorji commented 5 years ago

@kostko

Thanks for your quick reply and a proposed fix!

So I checked it out and while this works it seems a bit awkward to me to require the structs to be changed to embed a serialization helper struct to be able to use this.

That's a fair position. I agree that the model is not ideal. My approach is slightly different and doesn't require any modification to structs, you can check it out here: kostko@07ccbbb

I looked through your change and had a few concerns with it. It works, and works elegantly within the design that we have now.

But that (my original) design is flawed and needs to be fixed before we build anything else atop it.

For one, the fact that we pass checkFastpath and checkCodecSelfer were code smells, that something wasn't right with the design. Fundamentally, the fn() should be a static function that always returns the same value for a given context (since it caches them). I need to just re-architect it and remove those functions.

This currently breaks backward compatibility as it adds the UseDefault method to InterfaceExt which is not ideal, but this part can easily be refactored.

I don't like the idea of UseDefault. It kind of does the same think like checkFastpath, where it may give you the pre-cached value, or if not, it caches one only designed for ext=true or false.

Fundamentally, I just have to redesign the library to fix the design flaw and then support this.

I will submit something soon.

kostko commented 5 years ago

I agree with your assessment, so this sounds like a good plan. Let me know once you have something ;-)

kostko commented 5 years ago

Cool that was fast, I like the interface!

There is however one case that this misses, the case when a kNakedInterface decoding is used - that doesn't know about the special SelfExt sentinel and tries to call UpdateExt. Also the tests fixture still has some remains of the old interface.

ugorji commented 5 years ago

The test fixtures are fine. I updated them just enough, but I like that we bundle testing embedded types in it.

The kNakedInterface is just ... wrong. Phew! I will send a fix soon.

kostko commented 5 years ago

Yeah the embedded types are fine, but the old Codec* methods are likely not needed?

kostko commented 5 years ago

Works perfectly now, thank you very much for integrating this so quickly!