ugorji / go

idiomatic codec and rpc lib for msgpack, cbor, json, etc. msgpack.org[Go]
MIT License
1.85k stars 295 forks source link

Decoding fixstr into interface{} not working #315

Closed lgrip closed 4 years ago

lgrip commented 5 years ago

We have an object with an interface{} value which can either be a string or a []byte. On decoding the value is always set to a []byte value unless we set the RawToString option to true. Then the value is always a string. This makes it impossible to tell a string from a []byte value for a given field.

We can see in the byte stream that the string is encoded as a fixstr with a 0xaN type.

Is this the way this should work?

What are we missing?

func TestMsgpackInterface(t *testing.T) {
    type A struct {
        S string
        Y []byte
    }
    type B struct {
        S interface{}
        Y interface{}
    }

    var b bytes.Buffer
    e := codec.NewEncoder(&b, new(codec.MsgpackHandle))
    if err := e.Encode(A{"foo", []byte{0xca, 0xfe}}); err != nil {
        t.Fatal("encode", err)
    }
    buf := b.Bytes()
    fmt.Printf("%#v\n", buf)

    t.Run("default", func(t *testing.T) {
        var b B
        h := new(codec.MsgpackHandle)
        codec.NewDecoderBytes(buf, h).MustDecode(&b)
        if got, want := b, (B{"foo", []byte{0xca, 0xfe}}); !reflect.DeepEqual(got, want) {
            t.Logf("got  %#v", got)
            t.Logf("want %#v", want)
            t.Fail()
        }
    })
    t.Run("RawToString", func(t *testing.T) {
        var b B
        h := new(codec.MsgpackHandle)
        h.DecodeOptions.RawToString = true
        codec.NewDecoderBytes(buf, h).MustDecode(&b)
        if got, want := b, (B{"foo", []byte{0xca, 0xfe}}); !reflect.DeepEqual(got, want) {
            t.Logf("got  %#v", got)
            t.Logf("want %#v", want)
            t.Fail()
        }
    })
}
$ go test -v -run Interfa
=== RUN   TestMsgpackInterface
[]byte{0x82, 0xa1, 0x53, 0xa3, 0x66, 0x6f, 0x6f, 0xa1, 0x59, 0xa2, 0xca, 0xfe}
=== RUN   TestMsgpackInterface/default
=== RUN   TestMsgpackInterface/RawToString
--- FAIL: TestMsgpackInterface (0.00s)
    --- FAIL: TestMsgpackInterface/default (0.00s)
        raw_test.go:175: got  incore_vision.B{S:[]uint8{0x66, 0x6f, 0x6f}, Y:[]uint8{0xca, 0xfe}}
        raw_test.go:176: want incore_vision.B{S:"foo", Y:[]uint8{0xca, 0xfe}}
    --- FAIL: TestMsgpackInterface/RawToString (0.00s)
        raw_test.go:186: got  incore_vision.B{S:"foo", Y:"\xca\xfe"}
        raw_test.go:187: want incore_vision.B{S:"foo", Y:[]uint8{0xca, 0xfe}}
FAIL
ugorji commented 5 years ago

I am not sure what is now working.

What do you expect? What do you see?

lgrip commented 5 years ago

We expect string values to be decoded as strings and []byte values to be decoded as []byte.

Now, it seems like we need to choose to decode intercface{} values either as []byte (normal settings) or everything as string (RawToString == true).

csm commented 4 years ago

I'm running into something similar; my expectation with RawToString=true is that string types in msgpack (fixstr, str8, str16, str32) are decoded as string, but binary values (bin8, bin16, bin32) should be decoded as []byte. We have an app that has to distinguish between strings and bytes, so this is a blocker for us using this package.

In particular this line https://github.com/ugorji/go/blob/master/codec/msgpack.go#L574 looks wrong, it should just pass false to fauxUnionReadRawBytes. Moreover, this line https://github.com/ugorji/go/blob/master/codec/msgpack.go#L818 seems wrong as-is, since decoding with RawToString=true decodes the value as string, not []byte.

dionytadema commented 4 years ago

This issue has impact on the Nexus wamp router too, with both strings and []bytes ending up as strings when routed. Causing autobahn|python clients to trow protocol errors.

see https://github.com/gammazero/nexus/issues/214

dionytadema commented 4 years ago

can we get an update on this issue, there is a PR that fixes it and would help out anyone trying to work with both strings and bytes

ugorji commented 4 years ago

In msgpack, there are 2 supported specs.

In the old spec: fixraw was 101xxxxx (for raw bytes). This is now called fixstr in the new spec (for strings).

See

We determine whether to support the new or old spec based on the value of WriteExt (true or false).

If you want fixstr to be decoded as a string, it means that you want to honor the new spec, so you should set WriteExt=true.

If however, WriteExt=false (meaning you want to honor the old msgpack spec), but you want to fixraw to be decoded as string (not []byte), then you should set RawToString=true also.

It is hard for me to parse what you mean when you say:

We expect string values to be decoded as strings and []byte values to be decoded as []byte.

Hopefully, the above makes it clear how things should work.

I tested this on an adaptation of your code and it works fine:

func TestGithub315(t *testing.T) {
    type A struct {
        S string
        Y []byte
    }
    type B struct {
        S interface{}
        Y interface{}
    }

    h := new(codec.MsgpackHandle)
    h.WriteExt = true
    var b bytes.Buffer
    e := codec.NewEncoder(&b, h)
    if err := e.Encode(A{"foo", []byte{0xca, 0xfe}}); err != nil {
        t.Fatal("encode", err)
    }
    buf := b.Bytes()
    fmt.Printf("buf: %#v\n", buf)

    var b2 B
    codec.NewDecoderBytes(buf, h).MustDecode(&b2)
    if got, want := b2, (B{"foo", []byte{0xca, 0xfe}}); !reflect.DeepEqual(got, want) {
        t.Logf("got  %#v", got)
        t.Logf("want %#v", want)
        t.Fail()
    }
}
ugorji commented 4 years ago

In summary:

You need to set WriteExt=true so that the new spec is honored. When we built this library, most uses were using the old spec and the new spec was in its infancy, so it is an opt-in. Opt into it by setting the flag WriteExt=true.