ugorji / go

idiomatic codec and rpc lib for msgpack, cbor, json, etc. msgpack.org[Go]

Inconsistency in json decoding compared to encoding/json.Unmarshal #323

Status: Closed (anirbanmu closed this issue 4 years ago)

anirbanmu commented 4 years ago

For input like {"body":"\ud83e\udd23u\ud83e\udd23p\ud83e\udd23"}, where each emoji is written as an escaped UTF-16 surrogate pair, this library's JSON decoding is inconsistent with the standard library's encoding/json.Unmarshal.

encoding/json produces: 🤣u🤣p🤣
this library produces: �u�p�

Let me know if I'm using this library incorrectly, thanks.

repro:

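package main

import (
    "encoding/json"
    "log"
    "strings"

    "github.com/ugorji/go/codec"
)

func main() {
    // Each emoji is written as an escaped UTF-16 surrogate pair.
    body := `{"body":"\ud83e\udd23u\ud83e\udd23p\ud83e\udd23"}`

    // Decode with ugorji/go/codec.
    reader := strings.NewReader(body)
    decoder := codec.NewDecoder(reader, &codec.JsonHandle{})

    var out struct {
        Body string `json:"body"`
    }

    if err := decoder.Decode(&out); err != nil {
        log.Fatalf("codec: %v", err)
    }
    log.Printf("codec: %s", out.Body)

    // Decode the same input with encoding/json for comparison.
    var out1 struct {
        Body string `json:"body"`
    }

    if err := json.Unmarshal([]byte(body), &out1); err != nil {
        log.Fatalf("std: %v", err)
    }
    log.Printf("std: %s", out1.Body)
}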
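
For context on the expected output: the escapes \ud83e and \udd23 in the input form a UTF-16 surrogate pair, which together encode a single code point. The sketch below uses only the standard library's unicode/utf16 package to show what the pair should combine to, and what results when a high surrogate is not followed by a valid low surrogate. The helper name decodePair is made up for illustration, and this is not a claim about how either decoder is implemented internally.

```go
package main

import (
	"fmt"
	"unicode/utf16"
)

// decodePair is a hypothetical helper for illustration. It combines a UTF-16
// high/low surrogate pair into one code point, returning U+FFFD (the Unicode
// replacement character) if the two values are not a valid pair.
func decodePair(hi, lo rune) rune {
	return utf16.DecodeRune(hi, lo)
}

func main() {
	// The pair from the repro input: \ud83e followed by \udd23.
	r := decodePair(0xd83e, 0xdd23)
	fmt.Printf("%U %s\n", r, string(r)) // U+1F923 🤣

	// A high surrogate followed by an ordinary character is not a valid pair.
	bad := decodePair(0xd83e, 'u')
	fmt.Printf("%U\n", bad) // U+FFFD
}
```

The first result matches the 🤣 that encoding/json produces; the second is the same � character that appears in this library's output.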