ugorji / go

idiomatic codec and rpc lib for msgpack, cbor, json, etc. msgpack.org[Go]
MIT License
1.87k stars 295 forks source link

Embedded structs cause decoding issues. #25

Closed aarondl closed 11 years ago

aarondl commented 11 years ago

I'm sorry I couldn't get a smaller test case for this, I'm short on time this morning. This is trying to decode a payload from Python (hence the ugly string constant).

If you name the Bstruct's *Astruct field instead of embedding it, it decodes as expected. If you don't you get the error message:

2013/11/15 09:01:54 Failed to decode to struct:
codec.decoder: readContainerLen: Unrecognized descriptor byte: hex: 95, dec: 149
package main

import (
    "github.com/ugorji/go/codec"
    "bytes"
    "log"
    "github.com/davecgh/go-spew/spew"
)

type Astruct struct {
    Abyte1 []byte
    Abyte2 []byte
    Auint1 uint64
    Auint2 uint64
    Auint3 uint8
}

type Bstruct struct {
    *Astruct
    Bint1  int32
    Buint1 uint64
    Buint2 uint64
    Bint2  int32
    Bint3  int32
}

type Cstruct struct {
}

type Dstruct struct {
    Dmap map[string]*Bstruct
    Dslice []*Cstruct
}

var test = []byte("\x92\x81\xa4test\x96\x95\xda\x00 \x13\n\x1e\x12" +
            "\x15t\xf6y\x8e\xf6\xcd\xc3\xde)t2\xfc\xb7\xe0\xcf\xbd\xd9" +
            "\xbd\xc2$\xdc\x01\xc5=89\xc9\xda\x00 `\xcb1\xbe+\x1aGr" +
            "\x1a\x06\xef\xf3)\xc4\xcf\xef\xfcp\xac\xa9-\n\xe1TV\xb0" +
            "\xf8\xd5EX\xceH$\x0b\x01\x0b\x00\x00\x00\x00\x90")

func main() {
    var m codec.MsgpackHandle

    d := Dstruct{}
    decoder := codec.NewDecoder(bytes.NewBuffer(test), &m)
    if err := decoder.Decode(&d); err != nil {
        log.Println("Failed to decode to struct:", err)
    } else {
        log.Println(d)
    }

    var i interface{}
    decoder = codec.NewDecoder(bytes.NewBuffer(test), &m)
    if err := decoder.Decode(&i); err != nil {
        log.Println("Failed to decode to interface:", err)
    } else {
        spew.Dump(i)
    }
}

Output:

2013/11/15 09:01:54 Failed to decode to struct: codec.decoder: readContainerLen: Unrecognized descriptor byte: hex: 95, dec: 149
([]interface {}) {
 (map[interface {}]interface {}) {
  (string) "test": ([]interface {}) {
   ([]interface {}) {
    ([]uint8) {
     00000000  13 0a 1e 12 15 74 f6 79  8e f6 cd c3 de 29 74 32  |.....t.y.....)t2|
     00000010  fc b7 e0 cf bd d9 bd c2  24 dc 01 c5 3d 38 39 c9  |........$...=89.|
    },
    ([]uint8) {
     00000000  60 cb 31 be 2b 1a 47 72  1a 06 ef f3 29 c4 cf ef  |`.1.+.Gr....)...|
     00000010  fc 70 ac a9 2d 0a e1 54  56 b0 f8 d5 45 58 ce 48  |.p..-..TV...EX.H|
    },
    (int64) 36,
    (int64) 11,
    (int64) 1
   },
   (int64) 11,
   (int64) 0,
   (int64) 0,
   (int64) 0,
   (int64) 0
  }
 },
 ([]interface {}) {
 }
}
ugorji commented 11 years ago

I wouldn't get to this one soon.

Just so you are not stuck, you can change the const recoverPanicToErr to true (line 46 of helper.go). This will propagate the panic and crash the program with a stack trace when an error happens, and can get some insight into where the error is originating from.

ugorji commented 11 years ago

Can you work on getting this issue to a self-contained one, so I can fix.

Something short and straight to the point, like all your other issues.

Thanks.

aarondl commented 11 years ago

Yes I can. I actually have no idea what's going wrong with it, so that's why I couldn't condense it further. I'll dig in more and try to get a smaller reproduce.

aarondl commented 11 years ago

Okay, here's a new reproduce. The error should be easier to see since I've made the number of structs smaller, and the msgpack much simpler.

This snippet fails to decode if the Astruct inside Bstruct is embedded, if it is named (not embedded) then I receive no error and it decodes correctly. The error is as follows:

2013/11/16 09:07:18 Failed to decode to struct: codec.decoder:
Unhandled single-byte unsigned integer value: Unrecognized descriptor byte: c4

Now, the error message is different but I believe it's a symptom of the same underlying problem. If we can get this one fixed, I'll verify that it fixes the original snippet as well.

package main

import (
    "github.com/ugorji/go/codec"
    "bytes"
    "log"
    "github.com/davecgh/go-spew/spew"
)

type Astruct struct {
    Abyte1 []byte
}

type Bstruct struct {
    *Astruct
    Bint1     int32
}

var test1 = []byte{
    0x92, //Bstruct{
    0x91, //Astruct{
    0xc4, 0x5, 0x1, 0x2, 0x3, 0x4, 0x5, // []byte{1, 2, 3, 4, 5}
    0x6, // Bint1 = 6
}

func main() {
    var m codec.MsgpackHandle
    m.StructToArray = true
    m.WriteExt = true

    b := Bstruct{}

    decoder := codec.NewDecoder(bytes.NewBuffer(test1), &m)
    if err := decoder.Decode(&b); err != nil {
        log.Println("Failed to decode to struct:", err)
    } else {
        spew.Dump(b)
    }

    var i interface{}
    decoder = codec.NewDecoder(bytes.NewBuffer(test1), &m)
    if err := decoder.Decode(&i); err != nil {
        log.Println("Failed to decode to interface:", err)
    } else {
        spew.Dump(i)
    }
}

Output when Bstruct's Astruct is not named:

2013/11/16 09:02:59 Failed to decode to struct: codec.decoder:
Unhandled single-byte unsigned integer value: Unrecognized descriptor byte: c4
([]interface {}) {
 ([]interface {}) {
  ([]uint8) {
   00000000  01 02 03 04 05                                    |.....|
  }
 },
 (int64) 6
}

Output when Bstruct's Astruct is named:

(main.Bstruct) {
 A: (*main.Astruct)(0xc2000fa280)({
  Abyte1: ([]uint8) {
   00000000  01 02 03 04 05                                    |.....|
  }
 }),
 Bint1: (int32) 6
}
([]interface {}) {
 ([]interface {}) {
  ([]uint8) {
   00000000  01 02 03 04 05                                    |.....|
  }
 },
 (int64) 6
}
ugorji commented 11 years ago

K, this is a user error.

https://github.com/ugorji/go/blob/master/codec/encode.go#L663

// Anonymous fields are encoded inline if no struct tag is present. // Else they are encoded as regular fields.

So the encoding will be like this was encoded: type Bstruct struct { Abyte1 []byte Bint1 int32 }

However, you are passing something that looks like type Bstruct struct { type Astruct struct { Abyte1 []byte } Bint1 int32 }

It wouldn't work. Your struct is not aligned with the encoded data (the encoded data says that an array is inside an array).

If you want it to work, you have 2 options:

  1. Add a struct tag to the embedded field, so the Astruct is not encoded/decoded as if it's members are inline i.e. type Bstruct { *Astruct codec:"A"; Bint1 int32 }
  2. Do not have the Astruct be embedded

With that, things work, and error goes away.

To see how this works, you should set the const recoverPanicToErr to true (line 46 of helper.go). This way, when an error occurs, we don't recover the panic and you see where in the code it occurred, and you can follow it. It may help.

aarondl commented 11 years ago

This encoding was actually done by the C library wrapped in Python not by me. The original big scary example has data directly from that encoder. I simply reduced the problem to be this small bit.

I haven't used reflect enough to know if there's a way we can find out if the struct was embedded to detect and handle this case. However, if you're satisfied with this resolution then I am. Being able to embed the struct and encode/decode properly with a struct field literal tag is an acceptable workaround. Thanks.

ugorji commented 11 years ago

its actually not a work around per se. and the problem wasn't with the encoding. the problem was the struct. in this scenario, the lib usage is that fields of embedded struct members are encoded/decoded inline, except a strict tag is used to force the regular enc/Dec. this is one of those things I thought hard about ;)

(on go ... typing from phone ... enjoy) On Nov 15, 2013 9:06 PM, "Aaron L" notifications@github.com wrote:

This encoding was actually done by the C library wrapped in Python not by me. The original big scary example has data directly from that encoder. I simply reduced the problem to be this small bit.

I haven't used reflect enough to know if there's a way we can find out if the struct was embedded to detect and handle this case. However, if you're satisfied with this resolution then I am. Being able to embed the struct and encode/decode properly with a struct field literal tag is an acceptable workaround. Thanks.

— Reply to this email directly or view it on GitHubhttps://github.com/ugorji/go/issues/25#issuecomment-28617396 .

aarondl commented 11 years ago

Either way, let's consider this one buried. I think with all these issues out of the way, I can start coding haha. Thanks again!