golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
encoding/json: no way to preserve the order of map keys #27179

Open lavalamp opened 6 years ago

lavalamp commented 6 years ago

The encoding/json library apparently offers no way to decode data into an interface{} without scrambling the order of the keys in objects. Generally this isn't a correctness issue (per the spec the order isn't important), but it is super annoying to humans who may look at JSON data before and after it's been processed by a Go program.

The most popular yaml library solves the problem like this: https://godoc.org/gopkg.in/yaml.v2#MapSlice
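To make the reported behavior concrete, here is a minimal sketch using only encoding/json. The helper name roundTrip and the sample document are illustrative assumptions. Decoding into interface{} produces a map[string]interface{}, and the encoder then writes the keys in sorted order, so the input order is lost.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// roundTrip (hypothetical helper) decodes a JSON document into an
// interface{} and encodes it again. Objects become map[string]interface{},
// so the original key order cannot survive the trip.
func roundTrip(in string) (string, error) {
	var v interface{}
	if err := json.Unmarshal([]byte(in), &v); err != nil {
		return "", err
	}
	out, err := json.Marshal(v)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	out, err := roundTrip(`{"zebra":1,"apple":2,"mango":3}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // {"apple":2,"mango":3,"zebra":1}
}
```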

ALTree commented 6 years ago

This was asked in the past. Last week: #27050, and an old issue: #6244, where brad wrote:

JSON objects are unordered, just like Go maps. If you're depending on the order that a specific implementation serializes your JSON objects in, you have a bug.

While it's unlikely that the current behaviour of the json package will change, a proposal to add some new mechanism that preserves order (similar to the one in the yaml package) may be accepted. It would help, though, to have something more concrete to look at, i.e. a complete, full-fledged proposal explaining what the new piece of API would look like.

lavalamp commented 6 years ago

My search failed to turn up either of those issues, thanks.

I tried to make it clear that this is not about correctness, it's about being nice to human readers. If that's not a relevant concern then we can close this. If there is some chance of a change being accepted then I can make a proposal.

ALTree commented 6 years ago

If there is some chance of a change being accepted then I can make a proposal.

I don't know if there is but the package owners can probably tell you: cc @rsc @dsnet @bradfitz

mvdan commented 6 years ago

There is a bit of precedent to making things nicer for human readers. For example, fmt sorts maps when printing them, to avoid random ordering in the output.

The fmt and json packages are different, and sorting keys isn't the same as keeping the original order, but I think the purpose is similar. Making the output easy to read or modify by humans.
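As a small illustration of the fmt precedent mentioned above: since Go 1.12, fmt prints maps in key-sorted order. The helper name render is an assumption made for this sketch.

```go
package main

import "fmt"

// render (hypothetical helper) formats a map with fmt. Since Go 1.12 the
// keys are printed in sorted order, regardless of insertion order.
func render() string {
	m := map[string]int{"zebra": 1, "apple": 2, "mango": 3}
	return fmt.Sprint(m)
}

func main() {
	fmt.Println(render()) // map[apple:2 mango:3 zebra:1]
}
```

Note that this is sorting, not preserving the original order, which is the distinction drawn above.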

I imagine there's no way to do this under the hood or by simply adding extra API like Encoder.SetIndent, as native maps simply don't keep the order.

Adding a type like MapSlice sounds fine to me - I'd say file a proposal unless someone else comes up with a better idea in the next few days. There's the question of whether the proposal will get past "publish this as a third-party package", but otherwise it seems like a sound idea.

mvdan commented 6 years ago

Similar proposal in the past - keeping the order of the headers in net/http: https://github.com/golang/go/issues/24375

Just like in this case, the big question was where to store the actual order of the map keys.

dsnet commented 6 years ago

Strictly speaking, you can do this today using the raw tokenizer that json.Decoder provides. That said, I think there will be a comprehensive review of all json proposals and issues in the Go1.12 cycle. I can imagine solutions for this issue that also address problems of not being able to detect duplicate keys in objects.
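A minimal sketch of the tokenizer approach, assuming the input is a single top-level object. The function name orderedKeys is hypothetical. It reads keys with Decoder.Token and skips each value by decoding it into a json.RawMessage. Because every key is reported as it is read, repeated keys also show up in the result.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// orderedKeys (hypothetical helper) returns the keys of a top-level JSON
// object in document order, including any duplicates.
func orderedKeys(doc string) ([]string, error) {
	dec := json.NewDecoder(strings.NewReader(doc))
	tok, err := dec.Token()
	if err != nil {
		return nil, err
	}
	if d, ok := tok.(json.Delim); !ok || d != '{' {
		return nil, fmt.Errorf("expected a JSON object, got %v", tok)
	}
	var keys []string
	for dec.More() {
		tok, err := dec.Token()
		if err != nil {
			return nil, err
		}
		key, ok := tok.(string)
		if !ok {
			return nil, fmt.Errorf("expected an object key, got %v", tok)
		}
		keys = append(keys, key)
		// Skip the value, whatever its shape, without interpreting it.
		var skipped json.RawMessage
		if err := dec.Decode(&skipped); err != nil {
			return nil, err
		}
	}
	// Consume the closing '}'.
	if _, err := dec.Token(); err != nil {
		return nil, err
	}
	return keys, nil
}

func main() {
	keys, err := orderedKeys(`{"zebra":1,"apple":{"x":[1,2]},"mango":null}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(keys) // [zebra apple mango]
}
```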

lavalamp commented 6 years ago

I think the proposal would look like a flag (.SetPreserveOrder()) on the decoder that makes a MapSlice instead of a map[string]interface{}, plus making the encoder handle that type.

@dsnet Yeah, that solves the input half of the problem, but it's very inconvenient.

dsnet commented 6 years ago

Alternatively, it could output as [][2]interface{}, in which case you won't need to declare a new type in the json package.

lavalamp commented 6 years ago

I actually like that a lot. It should probably still be declared in the json package just for documentation purposes.

rsc commented 6 years ago

For now it seems like the best answer is a custom type (maybe a slice of pairs) with an UnmarshalJSON method that in turn uses the tokenizer.
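One way this suggestion could look, as a sketch rather than a definitive implementation. The names Pair, OrderedObject, decodeObject and decodeValue are hypothetical and not part of encoding/json. Nested objects are decoded into the same ordered type, and numbers arrive as float64 because Decoder.UseNumber is not enabled here.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// Pair and OrderedObject are hypothetical names, not part of encoding/json.
type Pair struct {
	Key   string
	Value interface{}
}

// OrderedObject keeps object members in the order they appeared in the input.
type OrderedObject []Pair

// UnmarshalJSON walks the token stream instead of decoding into a map.
func (o *OrderedObject) UnmarshalJSON(data []byte) error {
	dec := json.NewDecoder(bytes.NewReader(data))
	tok, err := dec.Token()
	if err != nil {
		return err
	}
	if d, ok := tok.(json.Delim); !ok || d != '{' {
		return fmt.Errorf("expected a JSON object, got %v", tok)
	}
	obj, err := decodeObject(dec)
	if err != nil {
		return err
	}
	*o = obj
	return nil
}

// decodeObject reads members up to and including the closing '}'.
func decodeObject(dec *json.Decoder) (OrderedObject, error) {
	obj := OrderedObject{}
	for dec.More() {
		tok, err := dec.Token()
		if err != nil {
			return nil, err
		}
		key, ok := tok.(string)
		if !ok {
			return nil, fmt.Errorf("expected an object key, got %v", tok)
		}
		val, err := decodeValue(dec)
		if err != nil {
			return nil, err
		}
		obj = append(obj, Pair{Key: key, Value: val})
	}
	_, err := dec.Token() // consume '}'
	return obj, err
}

// decodeValue reads one value; nested objects become OrderedObject.
func decodeValue(dec *json.Decoder) (interface{}, error) {
	tok, err := dec.Token()
	if err != nil {
		return nil, err
	}
	switch tok {
	case json.Delim('{'):
		return decodeObject(dec)
	case json.Delim('['):
		arr := []interface{}{}
		for dec.More() {
			v, err := decodeValue(dec)
			if err != nil {
				return nil, err
			}
			arr = append(arr, v)
		}
		_, err := dec.Token() // consume ']'
		return arr, err
	}
	return tok, nil // string, float64, bool, or nil
}

func main() {
	var obj OrderedObject
	doc := []byte(`{"zebra":1,"apple":{"y":true,"b":[{"q":null}]},"mango":2.5}`)
	if err := json.Unmarshal(doc, &obj); err != nil {
		panic(err)
	}
	for _, p := range obj {
		fmt.Println(p.Key) // zebra, apple, mango (one per line)
	}
}
```

Encoding would need a matching MarshalJSON method that writes the pairs back in slice order; it is omitted here to keep the sketch short.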

Zamiell commented 5 years ago

For the people who stumble upon this issue from Google, the following two libraries (pick one) can help you if you need an ordered JSON map:

https://gitlab.com/c0b/go-ordered-json
https://github.com/iancoleman/orderedmap

Also, for reference, see this common StackOverflow answer: https://stackoverflow.com/questions/25182923/serialize-a-map-using-a-specific-order

Of course, it would be fantastic if this were eventually part of the standard library, so I'll eagerly await a proposal from someone more proficient than I.

roshangade commented 5 years ago

As per JSON specification, order does not matter. Reference: https://tools.ietf.org/html/rfc7159#section-1

However, there are other technologies that use JSON where order does matter. Example: GraphQL. Reference: https://graphql.github.io/graphql-spec/June2018/#sec-Serialized-Map-Ordering

It's a much-needed enhancement.

astleychen commented 5 years ago

Feature needed to simplify our implementation on ordered serialization/deserialization.

eliben commented 4 years ago

Strictly speaking, you can do this today using the raw tokenizer that json.Decoder provides. That said, I think there will be a comprehensive review of all json proposals and issues in the Go1.12 cycle. I can imagine solutions for this issue that also address problems of not being able to detect duplicate keys in objects.

We just ran into a case where the order of fields in a JSON file was important, and this code snippet was helpful. However, when order matters not only in the top-level JSON object but also in deeper nested objects, the need to preserve order complicates the code significantly: instead of "just" unmarshaling the object, we have to do a series of piecemeal unmarshals to json.RawMessage so that we can use the unparsed byte stream at the right level.

@polinasok

ake-persson commented 4 years ago

Using MapSlice in JSON.

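import (
        "bytes"
        "encoding/json"
        "fmt"
)

type MapItem struct {
        Key, Value interface{}
}

type MapSlice []MapItem

// MarshalJSON encodes the items as a JSON object, keeping slice order.
func (ms MapSlice) MarshalJSON() ([]byte, error) {
        buf := &bytes.Buffer{}
        buf.WriteByte('{')
        for i, mi := range ms {
                // Encode the key with json.Marshal so it is escaped as valid JSON;
                // Go's %q quoting can emit escapes (such as \x00) that JSON rejects.
                k, err := json.Marshal(fmt.Sprint(mi.Key))
                if err != nil {
                        return nil, err
                }
                v, err := json.Marshal(mi.Value)
                if err != nil {
                        return nil, err
                }
                if i > 0 {
                        buf.WriteByte(',')
                }
                buf.Write(k)
                buf.WriteByte(':')
                buf.Write(v)
        }
        buf.WriteByte('}')
        return buf.Bytes(), nil
}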

Complete example with unmarshal in Go Playground

As a package mapslice-json.

dfurtado commented 4 years ago

Hi, I have a question about this issue.

I have been looking at the source code a bit and I found that the function func (me mapEncoder) encode(e *encodeState, v reflect.Value, opts encOpts) in encode.go actually sorts the keys by doing:

sort.Slice(sv, func(i, j int) bool { return sv[i].s < sv[j].s })

Here's a bigger snippet:

// Extract and sort the keys.
keys := v.MapKeys()
sv := make([]reflectWithString, len(keys))
for i, v := range keys {
    sv[i].v = v
    if err := sv[i].resolve(); err != nil {
        e.error(fmt.Errorf("json: encoding error for type %q: %q", v.Type().String(), err.Error()))
    }
}
sort.Slice(sv, func(i, j int) bool { return sv[i].s < sv[j].s })

Why is this sort done in the first place? Is there any reason, or did I miss something?

I gave it a go and removed that line, and the map[string]interface{} got serialized correctly.

cc @rsc @dsnet @bradfitz

Thanks!!! =)

mvdan commented 4 years ago

@dfurtado json encoding should be deterministic. Ranging over a map has an unspecified order, so there isn't a stable order we can use by default. So the encoder falls back to the equivalent of sort.Strings.
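To illustrate the point about determinism: ranging over a Go map yields keys in an unspecified order, so a stable result requires collecting the keys and sorting them. The sketch below shows that pattern in isolation. The helper name sortedKeys is an assumption, and this is not the encoder's actual code.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys (hypothetical helper) returns the keys of m in ascending order.
func sortedKeys(m map[string]int) []string {
	keys := make([]string, 0, len(m))
	for k := range m { // iteration order is unspecified and may vary per run
		keys = append(keys, k)
	}
	sort.Strings(keys) // sorting makes the result independent of that order
	return keys
}

func main() {
	m := map[string]int{"zebra": 1, "apple": 2, "mango": 3}
	fmt.Println(sortedKeys(m)) // [apple mango zebra]
}
```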

andig commented 4 years ago

I would like to pick up the discussion in the light of go2.

It seems to me one possible solution would be to enable maps to support ordered keys. This would help multiple use cases, for example url.Values, which is a map and so cannot maintain the initialization order during Values(), which in turn leads to invalid output for some APIs (whether the API is well designed is not the point of the discussion here).

Or add an ordered map type and use it in the standard library where applicable.

I've not found a go2 proposal for maintaining map key order- would this be the point to do so?
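For the url.Values case, a short sketch of the behavior and one possible workaround. url.Values.Encode is documented to sort by key. The param type and the encodeOrdered helper are hypothetical names introduced for illustration only.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// encodeSorted shows the standard behavior: Encode sorts by key, so the
// order in which parameters were added is not reflected in the output.
func encodeSorted() string {
	v := url.Values{}
	v.Add("zebra", "1")
	v.Add("apple", "2")
	v.Add("mango", "3")
	return v.Encode()
}

// param is a hypothetical key/value pair used to keep an explicit order.
type param struct{ key, value string }

// encodeOrdered (hypothetical helper) builds a query string that keeps the
// slice order, escaping each part with url.QueryEscape.
func encodeOrdered(params []param) string {
	parts := make([]string, 0, len(params))
	for _, p := range params {
		parts = append(parts, url.QueryEscape(p.key)+"="+url.QueryEscape(p.value))
	}
	return strings.Join(parts, "&")
}

func main() {
	fmt.Println(encodeSorted()) // apple=2&mango=3&zebra=1
	fmt.Println(encodeOrdered([]param{{"zebra", "1"}, {"apple", "2"}, {"mango", "3"}}))
	// zebra=1&apple=2&mango=3
}
```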

mvdan commented 4 years ago

I've not found a go2 proposal for maintaining map key order- would this be the point to do so?

I would say you should use a separate issue - you could point to this issue as a potential use case, but a major language change is far more invasive than adding a feature to the json package.

adrianre12 commented 3 years ago

I have just hit this problem too. After reading/editing/saving a JSON file, three different applications (written in C++, C#, and Java) would not read it due to the order being changed.

ryanc414 commented 3 years ago

We ran into this issue, where a third-party API we use requires us to extract an object from a JSON response and return it completely unchanged in a subsequent request body, including not changing the key ordering. Our workaround was to treat the object as an opaque byte slice like:

type rawJsonObject []byte

func (o rawJsonObject) MarshalJSON() ([]byte, error) {
    return []byte(o), nil
}

func (o *rawJsonObject) UnmarshalJSON(data []byte) error {
    *o = data
    return nil
}

Not sure if there is any better way to achieve the same thing. Is there any reason why the JSON marshaller cannot do this for []byte types automatically without the need for a custom type and marshal/unmarshal functions?

mvdan commented 3 years ago

@ryanc414 see https://golang.org/pkg/encoding/json/#RawMessage.
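A minimal sketch of how json.RawMessage can stand in for the custom type above. The envelope struct and its field names are hypothetical. The payload is stored as raw bytes on decode and written back on encode without being parsed into a map, so its key order is kept. One caveat: as far as I know, json.Marshal compacts the raw bytes and, by default, escapes HTML-sensitive characters, so insignificant whitespace is not preserved.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// envelope is a hypothetical response shape. Only Token is interpreted;
// Payload is carried through as uninterpreted JSON.
type envelope struct {
	Token   string          `json:"token"`
	Payload json.RawMessage `json:"payload"`
}

// echo (hypothetical helper) decodes a document and encodes it again.
func echo(in string) (string, error) {
	var e envelope
	if err := json.Unmarshal([]byte(in), &e); err != nil {
		return "", err
	}
	out, err := json.Marshal(e)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	out, err := echo(`{"token":"abc","payload":{"zebra":1,"apple":{"y":2,"b":3}}}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // {"token":"abc","payload":{"zebra":1,"apple":{"y":2,"b":3}}}
}
```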

xpol commented 2 years ago

The order of the map keys is important when I want to update a JSON file using a Go program. After the update, I hope the JSON keys keep the same order so I don't get unnecessary git changes for the JSON file.

kkqy commented 1 year ago

hey, I defined a simple type to solve the problem.

https://github.com/kkqy/gokvpairs

Just replace your "map[string]type" with "KeyValuePairs[type]" ("type" is the type of your value, which can also be interface{}), and treat it as a slice of KeyValuePair.

It can deal with both Marshal and Unmarshal automatically.

wk8 commented 1 year ago

https://github.com/wk8/go-ordered-map can do that

7sDream commented 1 year ago

I recently encountered this issue. After investigating the existing solutions and projects, I unfortunately found that none of them fully met my three needs: preserve order, allow duplicate keys, and keep numbers lossless.

So I wrote my own solution. It looks like this:

data := `{"a": 1.1234543234343238726283746, "b": false, "c": null, "b": "I'm back"}`
result, _ := geko.JSONUnmarshal([]byte(data), geko.UseNumber(true))
output, _ := json.Marshal(result)
fmt.Printf("Output: %s\n", string(output))
// Output: {"a":1.1234543234343238726283746,"b":false,"c":null,"b":"I'm back"}

object := result.(geko.ObjectItems)
fmt.Printf("b: %#v\n", object.Get("b"))
// b: []interface {}{false, "I'm back"}

In case someone happens to have the same scenario: geko and its documentation.
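For comparison, a sketch of what the standard library alone does with the same input, assuming a plain map target. The helper names decode and text are hypothetical. By default the number becomes a float64 and the second "b" overwrites the first. Decoder.UseNumber keeps the number literal intact, but the duplicate key and the key order are still lost.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

const doc = `{"a": 1.1234543234343238726283746, "b": false, "c": null, "b": "I'm back"}`

// decode (hypothetical helper) parses doc into a map, optionally keeping
// number literals as json.Number instead of converting them to float64.
func decode(useNumber bool) (map[string]interface{}, error) {
	dec := json.NewDecoder(strings.NewReader(doc))
	if useNumber {
		dec.UseNumber()
	}
	var m map[string]interface{}
	err := dec.Decode(&m)
	return m, err
}

// text (hypothetical helper) renders a decoded value with fmt.
func text(v interface{}) string { return fmt.Sprint(v) }

func main() {
	plain, err := decode(false)
	if err != nil {
		panic(err)
	}
	exact, err := decode(true)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(plain))       // 3: the second "b" replaced the first
	fmt.Println(text(plain["b"])) // I'm back
	fmt.Println(text(exact["a"])) // 1.1234543234343238726283746
}
```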

yuki2006 commented 1 year ago

The JSON protocol does not depend on the order of the fields, however, it is convenient when the order is aligned during testing.

Therefore, it would be appreciated if you could align the order in the same way as the input, or alternatively, enhance the functionality for testing purposes.

imReker commented 7 months ago

Go team of Google: You should NOT rely on order of JSON Object, it's written in RFC.

Firebase team of Google: You SHOULD rely on order of JSON Object.

Reference: https://firebase.google.com/docs/reference/remote-config/rest/v1/RemoteConfig#RemoteConfigParameterValue

ahmouse15 commented 2 months ago

It seems this issue is present in many parts of the standard library, including the Header map of http (as mentioned in #24375). Even though most specs, such as JSON and HTTP, don't require order to be preserved, many applications do rely on ordering. In my case, my Go program cannot access an API (of a very large tech company) because it expects HTTP headers to be listed in a specific order. Whether it is a bug or a means of preventing unwanted access remains to be seen, but since it is not technically a public API, I have no method of appealing for this to be fixed.

While it is a bug to expect a certain order in these types of programs, it still seems that order should be preserved in many of the networking-related portions of the Go standard library to improve compatibility with many APIs. It is annoying how subtle it can be to track down these types of problems.

dsnet commented 2 months ago

@ahmouse15 You are correct that this is a wider stdlib issue, but a more fundamental problem is that there is no native Go type that can nicely preserve order. A Go map doesn't work because it doesn't preserve order. A Go slice is only half the solution since we don't have a consistent way to represent key-value tuples. A solution for this might depend on #63221 or related proposals, otherwise each package would re-invent different ways to represent the key-value tuple.

For the time being, I'm not convinced that this needs to be a first-class feature in the stdlib.

BTW, the v2 "json" package contains an example of how to implement this functionality on your own: https://pkg.go.dev/github.com/go-json-experiment/json#example-package-OrderedObject

It might also make sense to declare helper data structures in a third-party "jsonx" package that assist with odd (but unfortunately real) schemas such as:

ryanc414 commented 2 months ago

@ahmouse15 You are correct that this is a wider stdlib issue, but a more fundamental problem is that there is no native Go type that can nicely preserve order. A Go map doesn't work because it doesn't preserve order.

I would say that providing an ordered map type in the stdlib would be a good idea, similar to how Python provided the OrderedDict type before the builtin dict became insertion-ordered by default. Of course the interface would differ from the standard map since we can't override the [] operator but it could still be useful in certain cases.
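A rough sketch of what such an ordered map type might look like with generics (Go 1.18 or later). Everything here, including the name OrderedMap and its method set, is a hypothetical illustration and not a proposed stdlib API. As with Python's OrderedDict, updating an existing key keeps its original position. Deletion is left out to keep the example short.

```go
package main

import "fmt"

// OrderedMap is a hypothetical insertion-ordered map: a slice records the
// order in which keys were first set, and a regular map holds the values.
type OrderedMap[K comparable, V any] struct {
	keys   []K
	values map[K]V
}

// NewOrderedMap returns an empty OrderedMap ready for use.
func NewOrderedMap[K comparable, V any]() *OrderedMap[K, V] {
	return &OrderedMap[K, V]{values: make(map[K]V)}
}

// Set stores value under key. A new key is appended to the order; an
// existing key keeps its original position.
func (m *OrderedMap[K, V]) Set(key K, value V) {
	if _, exists := m.values[key]; !exists {
		m.keys = append(m.keys, key)
	}
	m.values[key] = value
}

// Get returns the value stored under key and whether it was present.
func (m *OrderedMap[K, V]) Get(key K) (V, bool) {
	v, ok := m.values[key]
	return v, ok
}

// Keys returns a copy of the keys in insertion order.
func (m *OrderedMap[K, V]) Keys() []K {
	return append([]K(nil), m.keys...)
}

func main() {
	m := NewOrderedMap[string, int]()
	m.Set("zebra", 1)
	m.Set("apple", 2)
	m.Set("zebra", 3) // update: position of "zebra" is unchanged
	fmt.Println(m.Keys()) // [zebra apple]
	v, _ := m.Get("zebra")
	fmt.Println(v) // 3
}
```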