golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
124.16k stars 17.69k forks source link

encoding/json: Unmarshaler breaks DisallowUnknownFields #41144

Open alvaroaleman opened 4 years ago

alvaroaleman commented 4 years ago

What version of Go are you using (go version)?

$ go version
go version go1.15rc1 linux/amd64

Does this issue reproduce with the latest release?

yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/alvaro/.gocache"
GOENV="/home/alvaro/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/alvaro/git/golang/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/alvaro/git/golang"
GOPRIVATE=""
GOPROXY="direct"
GOROOT="/usr/local/go"
GOSUMDB="off"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build855135854=/tmp/go-build -gno-record-gcc-switches"

What did you do?

Built a custom json.Unmarshaler to default a config file, then used a json.Decoder with DisallowUnknownFields to unmarshal a json representation that had an additional, unknown field. I expected that to fail but it didn't.

https://play.golang.org/p/8cEJU-Y0-9L

What did you expect to see?

An error, pointing out the unknown, additional field in the json document

What did you see instead?

No error

icholy commented 4 years ago

When you call json.Unmarshal in your UnmarshalJSON method, it creates a new decodeState which knows nothing about the top level json.Decoder instance. If you want to error on unknown fields, you need to enforce that in your json.Unmarshaler implementation.

alvaroaleman commented 4 years ago

When you call json.Unmarshal in your UnmarshalJSON method, it creates a new decodeState which knows nothing about the top level json.Decoder instance. If you want to error on unknown fields, you need to enforce that in your json.Unmarshaler implementation.

I know that. But I do not want to generally enforce that. I want to use whatever setting is in the top level json.Decoder instance and it seems its impossible to find that out (or did I miss something?)

mvdan commented 4 years ago

The problem here is that json.Unmarshaler does not have built-in support for options, so there's just no good way to solve this with the current API. You could remember the options on your own and re-apply them in your nested Unmarshal/Decode call, if you want.

I don't think there's a way to fix this in the current API without duplicating multiple chunks of it, or breaking it entirely, which is not an option in Go 1.x. I recently collected my thoughts about an encoding/json redesign, and the issues with the options are near the top of that list, if you're interested.

alvaroaleman commented 4 years ago

The problem here is that json.Unmarshaler does not have built-in support for options, so there's just no good way to solve this with the current API. You could remember the options on your own and re-apply them in your nested Unmarshal/Decode call, if you want.

I don't control all code that calls Unmarshal so there is not really a way for me to remember this, or are there more ways?

This issue is basically about changing the api in a way that does allow passing these options down, for example via an additional, optional interface (UnmarshalerWithOptions?). I don't have a lot of opinion on how that should look like, but I wanted to a) confirm my suspicion, that this is not possible today and b) have a tracking issue for this

mvdan commented 4 years ago

I imagine there's an issue for this already, but I honestly can't find it right now. Optionally expanding the interface is an option, but honestly not a very good one - you then move the problem to fixing all the implementations to do the right thing, which also adds quite a lot of code churn, almost like a switch to a v2 API.

alvaroaleman commented 4 years ago

Optionally expanding the interface is an option, but honestly not a very good one - you then move the problem to fixing all the implementations to do the right thing, which also adds quite a lot of code churn, almost like a switch to a v2 API.

Well as mentioned, I don't have a lot of opinion on how exactly this should look like, but I imagine that a v2 api design will take very long and I'd prefer a not-so-nice solution for this in the foreseeable future over a nice solution in the very far future (also, adding something to the current api won't keep us from implementing something proper in a v2 api)

mvdan commented 4 years ago

Indeed, a redesign could take a while. The problem with not-so-nice solutions is that sometimes they're worse than not doing anything at all, and I think the consensus for now is to mostly freeze the encoding/json API as is. You can look at all the frozen json proposals for examples. @rsc @dsnet might be able to confirm with more authority than me :)

If you want to propose adding to the package API, then you should turn this into a proposal and get it reviewed like any other proposed API changes. You could reuse this issue, or open a new one instead. See https://github.com/golang/proposal

dsnet commented 4 years ago

As @mvdan mentioned, I don't know if there is a specific issue about the top-level option passing problem, but it's one that I've commented about in a number of different encoding/json issues. Adding new features carries O(n!) maintenance cost, since you need to consider how the newly added feature interacts with every feature that has already provided (and all possible combinations of those features). I don't think we should add another UnmarshalerWithOptions interface without carefully thinking about the implications. Just as a high-level overview, such an interface would probably require a dependency on the encoding/json package, which is already a significant divergence from the Unmarshaler interface itself (which has no dependency on anything other than basic types like []byte, int, and error). If we were to force that dependency through the interface, it also brings into question whether the interface should provide more things than just the options. For example, currently calling nested Unmarshalers is O(n^2) since the encoding/json package has to repeatedly parse the entire sub-tree to find the end of the JSON token before it can call UnmarshalJSON. If we were to pass something similar to json.Decoder, it could be much more efficient.

dsnet commented 1 year ago

Hi all, we kicked off a discussion for a possible "encoding/json/v2" package that addresses the spirit of this proposal.

In v2 we propose supporting the following interface:

type UnmarshalerV2 interface {
    UnmarshalJSONV2(*jsontext.Decoder, Options) error
}

Of particular note, this passes down an Options value that can preserve the DisallowUnknownFields semantic. Existing UnmarshalJSON methods obviously can't benefit, but types that implement the UnmarshalJSONV2 method can benefit in the future.