ugorji / go

idiomatic codec and rpc lib for msgpack, cbor, json, etc. msgpack.org[Go]
MIT License
1.85k stars 295 forks source link

bug: MapValueReset has no effect #308

Closed primalmotion closed 5 years ago

primalmotion commented 5 years ago

The sample code below doesn't seem to work as it should:

package main

import (
    "bytes"
    "fmt"

    "github.com/ugorji/go/codec"
)

func main() {

    h := &codec.JsonHandle{}
    h.MapValueReset = true

    d := bytes.NewBuffer([]byte(`{"map": {"a": 1}}`))
    dec := codec.NewDecoder(d, h)

    o := struct {
        Map map[string]interface{} `json:"map"`
    }{}

    o.Map = map[string]interface{}{"b": 1}

    if err := dec.Decode(&o); err != nil {
        panic(err)
    }

    fmt.Println(o.Map)
}

This prints:

map[a:1 b:1]

Since I set MapValueReset = true, I would expect it to print:

map[a:1]
ugorji commented 5 years ago

Hi,

MapValueReset means that we grab the value from the map and decode into that.

For example, suppose you have:

type S struct {
  K int
  L string
}

var v = map[int]S{1:S{K:1, L: "99"}}

Then if you have a stream that has json like:

{1:{L: "11"}}

After decoding, the v will now be:

map[int]S{1:S{K:1, L: "11"}}

if MapValueReset=false, then the v after decoding will be:

map[int]S{1:S{K:0, L: "11"}}

i.e. MapValueReset determines if we will grab the value from the map and update it, as opposed to just create a new value and only update the values in the stream.

See

primalmotion commented 5 years ago

So I should set MapValueReset = false to actually reset (ie get a brand new empty map?)

ugorji commented 5 years ago

Decoding always updates.

You cannot selectively create a new value for a part of something being decoded into.

If you want just the value in the stream, then decode into a zero'ed object i.e.

var v T
...Decode(&v)

But if you are decoding into an already populated object, then we provide an option that says to decode into the value already in the map. It's then no different from decoding into a struct field, where we grab the struct field, and update it.

ugorji commented 5 years ago

See https://godoc.org/github.com/ugorji/go/codec#Decoder.Decode

It is explained in detail there.

primalmotion commented 5 years ago

EDIT: scratch that I got it

I'm sorry I'm having trouble understanding this option.

The doc states:

This in-place update maintains consistency in the decoding philosophy (i.e. we ALWAYS update in place by default). However, the consequence of this is that values in slices or maps which are not zero'ed before hand, will have part of the prior values in place after decode if the stream doesn't contain an update for those parts. This in-place update can be disabled by configuring the MapValueReset and SliceElementReset decode options available on every handle.

I read that if I set the MapValueReset to true, it will disable the in place update no?

primalmotion commented 5 years ago

Also we noticed that since 1.1.7, a tons of part of our application just broke because of garbage values in maps and slice that was not happening on <1.1.6

primalmotion commented 5 years ago

So here's where I am with my investigation:

package main

import (
    "bytes"
    "fmt"

    "github.com/ugorji/go/codec"
)

type Wrapper struct {
    Thing *Thing `json:"thing" msgpack:"thing"`
}

type Thing struct {
    String string `json:"string" msgpack:"string"`
}

func main() {
    decode(`{}`)
    decode(`{"thing": {}}`)
    decode(`{"thing": null}`)
}

func decode(data string) {

    t := &Thing{String: "hello"}
    w := &Wrapper{Thing: t}

    h := &codec.JsonHandle{}
    buff := bytes.NewBuffer([]byte(data))

    dec := codec.NewDecoder(buff, h)
    if err := dec.Decode(w); err != nil {
        panic(err)
    }

    fmt.Println("input:", data)
        fmt.Println("t.String?:", t.String)
    fmt.Println()
}

This code will produce with 1.1.4 (and 1.1.5-pre, and 1.1.5-alpha)

$ go run main.go
input: {}
t.String?: hello

input: {"thing": {}}
t.String?: hello

input: {"thing": null}
t.String?: hello

This code will produce with 1.1.6 and 1.1.7 and master

$ go run main.go
input: {}
t.String?: hello

input: {"thing": {}}
t.String?: hello

input: {"thing": null}
t.String?: 
primalmotion commented 5 years ago

I replaced go-codec with standard encoding/json and it produces the same result I would expect, being

go run main.go
input: {}
t.String?: hello

input: {"thing": {}}
t.String?: hello

input: {"thing": null}
t.String?: hello
ugorji commented 5 years ago
input: {"thing": null}
t.String?: 

This is correct.

null in a stream means the zero value. So - we will replace the mapping to "thing" with the zero value of string, which is "".

The documentation makes this explicit, in the summary section:

- NIL in data stream decoded as zero value

and in the Decode documentation:

However, when decoding a stream nil, we reset the destination container to its "zero" value (e.g. nil for slice/map, etc).

Previously, we were not consistent in nil handling. That is one of the things we cleaned up across the board in the code. NULL/NIL in the stream always means the zero value. If it was a slice/map/pointer, we set it to nil. If a number, we set it to 0. If a bool, we set it to false. If a string, we set it to "".

As part of streamlining this, we deprecated a flag: DeleteOnNilMapValue

See the documentation, quoted here:

 // DeleteOnNilMapValue controls how to decode a nil value in the stream.
    //
    // If true, we will delete the mapping of the key.
    // Else, just set the mapping to the zero value of the type.
    //
    // Deprecated: This does NOTHING and is left behind for compiling compatibility.
    // This change is necessitated because 'nil' in a stream now consistently
    // means the zero value (ie reset the value to its zero state).
    DeleteOnNilMapValue bool

Hope this helps clarify things.

encoding/json is not as consistent, decoding null as nil for pointers, slices, maps, interfaces, etc but ignoring it for numbers, string, bool. I think that inconsistency a mistake and we handled it more consistently in go-codec. I can understand if others feel differently, but this is one that I thought long and hard about before settling on this.

ugorji commented 5 years ago

Question: why do you have json and msgpack in your struct tags?

By default, go-codec uses json or codec, while encoding/json uses json. You can however configure to use your own struct tags as needed, by setting your own TypeInfos. Just FYI.

primalmotion commented 5 years ago

ok so it's a bug fix. But now if I want to implement a patch (meaning don't touch that value) how would I do this? Let's say I have

type T1 struct {
     A string
}
type T2 struct {
    A *T1
    B *T1
}

o := T2{
    A: &T1{A: "hello"}
}

If I encode o and send it to the server, the server decodes it and it will get

T2{
    A: &T1{A: "hello"}
    B: &T1{A: ""}
}

Thus it will not be able to guess that the request was a patch on A only, Or Am I missing something else?

primalmotion commented 5 years ago

Question: why do you have json and msgpack in your struct tags?

We tried many encoders, and we just kept them for backward compat (we use msgpackHandle.TypeInfos = codec.NewTypeInfos([]string{"msgpack"}))

ugorji commented 5 years ago

A patch will work. We only touch fields/mappings/slice or array elements with contents in the stream.

So, if you already had a value v which was

v = T2{
  A: &T1{A: "AA"}, 
  B: &T1{A: "AB"},
}

And you decode the equivalent of o above using omitempty, so that o.B is not in the stream, then your encoded stream will be like:

{"A": {"A": "hello"}}

If you then decode this into v above, after decoding, v will be:

v = T2{
  A: &T1{A: "hello"}, 
  B: &T1{A: "AB"},
}

i.e. B will not be touched. B is only touched if B was in the stream i.e. if you were decoding the json stream like

{"A": {"A": "hello"}, "B": null}

or otherwise.

Hope this explains things.

i.e. for a patch, use omitempty or use a patch object to encode the value, and decode into the large object, so you are only passing the values you want to be into the stream.

ugorji commented 5 years ago

In your benchmarks, you should see some remarkable performance improvements, and i have a few tee'd up for 1.1.8. Just FYI ...

primalmotion commented 5 years ago

ok that's what I though. We have some idiotic test suite that prevent us to use omitempty. A good reason to remind them to fix that crap.

I know you have made a ton of improvement which is why I really don't want to get stuck with 1.1.4 :)

Thanks for all the explanations!

john8329 commented 1 year ago

Hi and sorry for writing in a closed issue, but I think it'd be a good idea to write in the docs that the MapValueReset option doesn't actually delete keys that are missing from json. For example I expected a map[string]string to have its keys deleted rather than its values set as zero. It's trivial only after you understand it.