golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
123.36k stars 17.58k forks source link

encoding/json: JSON tags don't handle empty properties, non-standard characters #22518

Open schani opened 6 years ago

schani commented 6 years ago

What did you do?

https://play.golang.org/p/RRB1VFNufW

Trying to unmarshal with tags like this:

type Data struct {
    Foo    string `json:"Foo"`
    Empty  string `json:""`
    Quote  string "json:\"\\\""
    Smiley string "json:\"\U0001F610\""
}

What did you expect to see?

{"Foo": "bla", "": "nothing", "\"": "quux", "😐": ":-|"}
{"Foo":"bla","":"nothing","\"":"quux","😐":":-|"}

What did you see instead?

{"Foo": "bla", "": "nothing", "\"": "quux", "😐": ":-|"}
{"Foo":"bla","Empty":"","Quote":"","Smiley":""}

System details

go version go1.9.2 darwin/amd64
GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH="/Users/schani/go"
GORACE=""
GOROOT="/usr/local/go"
GOTOOLDIR="/usr/local/go/pkg/tool/darwin_amd64"
GCCGO="gccgo"
CC="clang"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/1h/7ghts1ys53xdk8y1czrjzdmw0000gn/T/go-build770874315=/tmp/go-build -gno-record-gcc-switches -fno-common"
CXX="clang++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOROOT/bin/go version: go version go1.9.2 darwin/amd64
GOROOT/bin/go tool compile -V: compile version go1.9.2
uname -v: Darwin Kernel Version 16.7.0: Thu Jun 15 17:36:27 PDT 2017; root:xnu-3789.70.16~2/RELEASE_X86_64
ProductName:    Mac OS X
ProductVersion: 10.12.6
BuildVersion:   16G29
lldb --version: lldb-900.0.50.1
  Swift-4.0
gdb --version: GNU gdb (GDB) 7.12.1
ianlancetaylor commented 6 years ago

This is because of isValidTag (https://golang.org/src/encoding/json/encode.go#L805). Your tags are not considered to be valid, and are therefore ignored. isValidTag was implemented to fix #1520, but I think that issue was before the standard field tag format.

ianlancetaylor commented 6 years ago

Do other JSON encoders support field names that are not ordinary identifiers?

schani commented 6 years ago

@ianlancetaylor Newtonsoft.JSON for .NET handles all those cases.

ibrt commented 6 years ago

Might be worth noting that the JSON spec(s) [1] [2] allow for any unicode character in object keys.

JSON.parse (ECMAScript 5.1 and up) also supports that.

gopherbot commented 6 years ago

Change https://golang.org/cl/140305 mentions this issue: encoding/json: permit any string as a struct field name

ghost commented 5 years ago

@rsc Should this issue be considered with "JSON sweep", or is it fine as-is? The PR that I submitted can be amended to remove the introduced emptykey option.

mvdan commented 5 years ago

I'd like to hear from the other encoding/json owners before we merge that CL - ping @rsc @dsnet. The proposed change seems fine to me in principle, although I can't think of where I'd take advantage of it myself.

dsnet commented 5 years ago

This change seems too narrow-sighted. An emptykey option permits one currently invalid key among many other keys that are still not allowed. For example, what if I wanted a comma to be in my JSON key? Currently that isn't allowed since commas are used as the delimiter between the different tag arguments. Why is an empty string as a key more special than a key with commas?

A more generalized solution might be to allow a quoted string as the name argument:

Field1 int `json:"\"\",string"` // JSON name is ``
Field2 int `json:"\"foo,bar\",string"` // JSON name is `foo,bar`
Field3 int `json:"\"foo\\\"bar\",string"` // JSON name is `foo"bar`

However, this looks ugly since it requires double escaping.

Personally, I'm okay with the fact that the json package can't emit every possible key via struct tags. Using map[string]interface{} is a possible workaround.

pwxc commented 5 years ago

From #29476 .

I think it's a great pity that we can't use Chinese punctuation as json's key. But in fact, I still have other ways to avoid this situation.

What I encountered was that I had to add to the key, but that didn't work. So I used two < to simulate, but < were automatically converted to \u003e(And I dont know why) . So I can only ask other developers in the group not to use .

andybons commented 5 years ago

Marking as NeedsDecision since it's not clear what is the correct solution to this issue.

mvdan commented 4 years ago

If anyone does want to take a crack at allowing arbitrary utf-8 as JSON keys, please read @dsnet's comment at https://github.com/golang/go/issues/39189#issuecomment-632467837.

I personally agree with the earlier comment here that it's okay for encoding/json to not have built-in support for this. Allowing nested quoting would be a possibility, but also a pretty ugly one. I don't see a nice way to fit this into the current API, but perhaps others have ideas on how a different API could work.

networkimprov commented 4 years ago

Allow \uNNNN sequences?

Tnze commented 1 year ago

There is also no reason why commas are not allowed in JSON keys. Using commas in JSON keys is very common:

{"username,version1": "Foo"}

There is no any way for us to correctly unmarshal the JSON above to a structure:

type Data struct {
    UsernameV1 string `json:"username,version1"`
    // UsernameV1 string `json:"username\,version1"`
    // UsernameV1 string `json:"username\\,version1"`
}

Can't we support any escape seq here?

Tnze commented 1 year ago

Or we can add a new tag jsonkey for raw json keys without options. And then give it a higher priority.

type Data struct {
    /*                       Options here       Key, supports comma       */
    UsernameV1 string `json:",omitempty" jsonkey:"username,version1"`
}
dblock commented 1 year ago

We have run into this in https://github.com/opensearch-project/OpenSearch/issues/8744.