go-json-experiment / json

Experimental implementation of a proposed v2 encoding/json package
BSD 3-Clause "New" or "Revised" License

JSON Serialization (v2)


This module hosts an experimental implementation of v2 encoding/json. The API is unstable and breaking changes will regularly be made. Do not depend on this in publicly available modules.

Any commits that make breaking API or behavior changes will be marked with the string "WARNING: " near the top of the commit message. It is your responsibility to inspect the list of commit changes when upgrading the module. Not all breaking changes will lead to build failures.

A discussion about including this package in Go as encoding/json/v2 was started on the Go GitHub project on 2023-10-05. Please provide your feedback there.

Goals and objectives

Expectations

While this module aims to become the v2 implementation of encoding/json, there is no guarantee that this outcome will occur. As with any major change to the Go standard library, this will eventually go through the Go proposal process. At present, this is still in the design and experimentation phase and is not ready for a formal proposal.

There are several possible outcomes from this experiment:

  1. We determine that a v2 encoding/json would not provide sufficient benefit over the existing v1 encoding/json package. Thus, we abandon this effort.
  2. We propose a v2 encoding/json design, but it is rejected in favor of some other design that is considered superior.
  3. We propose a v2 encoding/json design, but rather than adding an entirely new v2 encoding/json package, we decide to merge its functionality into the existing v1 encoding/json package.
  4. We propose a v2 encoding/json design and it is accepted, resulting in its addition to the standard library.
  5. Some other unforeseen outcome (among the infinite number of possibilities).

Development

This module is primarily developed by @dsnet, @mvdan, and @johanbrandhorst with feedback provided by @rogpeppe, @ChrisHines, and @rsc.

Discussions about semantics occur semi-regularly; a record of past meetings can be found here.

Design overview

This package aims to provide a clean separation between syntax and semantics. Syntax deals with the structural representation of JSON (as specified in RFC 4627, RFC 7159, RFC 7493, RFC 8259, and RFC 8785). Semantics deals with the meaning of syntactic data as usable application data.

The Encoder and Decoder types are streaming tokenizers concerned with the packing or parsing of JSON data. They operate on Token and Value types which represent the common data structures that are representable in JSON. Encoder and Decoder do not aim to provide any interpretation of the data.
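To make the idea of a purely syntactic token stream concrete, here is a minimal sketch that uses the token API of the existing v1 encoding/json Decoder, not the v2 Encoder and Decoder described above. It only illustrates what "reading tokens without interpreting them" looks like; the helper name tokenKinds is hypothetical, and the v2 Token and Value types differ from the v1 types shown here.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// tokenKinds is a hypothetical helper. It walks the input token by token
// with the v1 Decoder and describes each token, without mapping the data
// onto any application-level Go type.
func tokenKinds(input string) ([]string, error) {
	dec := json.NewDecoder(strings.NewReader(input))
	var kinds []string
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return kinds, nil
		}
		if err != nil {
			return nil, err
		}
		switch t := tok.(type) {
		case json.Delim: // one of [ ] { }
			kinds = append(kinds, "delim "+t.String())
		case string:
			kinds = append(kinds, "string "+t)
		case float64:
			kinds = append(kinds, fmt.Sprintf("number %v", t))
		case bool:
			kinds = append(kinds, fmt.Sprintf("bool %v", t))
		case nil:
			kinds = append(kinds, "null")
		}
	}
}

func main() {
	kinds, err := tokenKinds(`{"name":"gopher","tags":[1,true,null]}`)
	if err != nil {
		panic(err)
	}
	for _, k := range kinds {
		fmt.Println(k)
	}
}
```

Note that object names and string values both surface as plain strings here; deciding what they mean is left to the semantic layer.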

Functions like Marshal, MarshalWrite, MarshalEncode, Unmarshal, UnmarshalRead, and UnmarshalDecode provide semantic meaning by correlating any arbitrary Go type with some JSON representation of that type (as stored in data types like []byte, io.Writer, io.Reader, Encoder, or Decoder).
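As a rough illustration of correlating one Go type with JSON held in different containers, the sketch below uses the closest equivalents in the existing v1 encoding/json package (Marshal, Unmarshal, Encoder.Encode, and Decoder.Decode). It does not call the v2 functions named above, whose signatures may differ and may change; the Point type and helper names are hypothetical.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"strings"
)

// Point is a hypothetical application type used only for illustration.
type Point struct {
	X int `json:"x"`
	Y int `json:"y"`
}

// toBytes produces the JSON representation of p as a []byte (returned as a string).
func toBytes(p Point) (string, error) {
	b, err := json.Marshal(p)
	return string(b), err
}

// toWriter writes the JSON representation of p to an io.Writer.
// The v1 Encoder appends a trailing newline after each value.
func toWriter(p Point) (string, error) {
	var buf bytes.Buffer
	err := json.NewEncoder(&buf).Encode(p)
	return buf.String(), err
}

// fromBytes reconstructs a Point from JSON held in a []byte.
func fromBytes(s string) (Point, error) {
	var p Point
	err := json.Unmarshal([]byte(s), &p)
	return p, err
}

// fromReader reconstructs a Point from JSON read from an io.Reader.
func fromReader(s string) (Point, error) {
	var p Point
	err := json.NewDecoder(strings.NewReader(s)).Decode(&p)
	return p, err
}

func main() {
	s, _ := toBytes(Point{X: 1, Y: 2})
	fmt.Println(s) // {"x":1,"y":2}

	p, err := fromReader(s)
	fmt.Println(p, err) // {1 2} <nil>
}
```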

API overview

This diagram provides a high-level overview of the v2 json and jsontext packages. Purple blocks represent types, while blue blocks represent functions or methods. The arrows and their direction represent the approximate flow of data. The bottom half of the diagram contains functionality that is only concerned with syntax (implemented by the jsontext package), while the upper half contains functionality that assigns semantic meaning to syntactic data handled by the bottom half (as implemented by the v2 json package).

In contrast to v1 encoding/json, options are represented as separate types rather than being setter methods on the Encoder or Decoder types. Some options affect JSON serialization at the syntactic layer, while others affect it at the semantic layer. Some options only affect JSON when decoding, while others only affect it when encoding.
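For reference, this is what the v1 side of that contrast looks like: a minimal sketch of options expressed as setter methods on the v1 Encoder and Decoder. It shows only the existing v1 API (SetEscapeHTML and DisallowUnknownFields); the helper names are hypothetical, and the v2 option types are deliberately not shown because their exact names are subject to change.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"strings"
)

// encodeHTML encodes s with the v1 Encoder, toggling HTML escaping through
// the SetEscapeHTML setter method. The trailing newline that Encode appends
// is trimmed for readability.
func encodeHTML(s string, escape bool) (string, error) {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf)
	enc.SetEscapeHTML(escape)
	if err := enc.Encode(s); err != nil {
		return "", err
	}
	return strings.TrimSuffix(buf.String(), "\n"), nil
}

// decodeStrict decodes input into a struct with a single Name field and
// rejects unknown object members through the DisallowUnknownFields setter.
func decodeStrict(input string) error {
	var v struct{ Name string }
	dec := json.NewDecoder(strings.NewReader(input))
	dec.DisallowUnknownFields()
	return dec.Decode(&v)
}

func main() {
	escaped, _ := encodeHTML("<b>", true)
	plain, _ := encodeHTML("<b>", false)
	fmt.Println(escaped) // "\u003cb\u003e"
	fmt.Println(plain)   // "<b>"

	fmt.Println(decodeStrict(`{"Name":"a","Extra":1}`) != nil) // true
}
```

In v1 these settings live on the Encoder or Decoder value itself, so they are unavailable to plain Marshal and Unmarshal calls.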

Behavior changes

The v2 json package changes the default behavior of Marshal and Unmarshal relative to the v1 json package to be more sensible. Some of these behavior changes have options and workarounds to opt into behavior similar to what v1 provided.

This table shows an overview of the changes:

| v1 | v2 | Details |
| --- | --- | --- |
| JSON object members are unmarshaled into a Go struct using a case-insensitive name match. | JSON object members are unmarshaled into a Go struct using a case-sensitive name match. | CaseSensitivity |
| When marshaling a Go struct, a struct field marked as omitempty is omitted if the field value is an empty Go value, which is defined as false, 0, a nil pointer, a nil interface value, and any empty array, slice, map, or string. | When marshaling a Go struct, a struct field marked as omitempty is omitted if the field value would encode as an empty JSON value, which is defined as a JSON null, or an empty JSON string, object, or array. | OmitEmptyOption |
| The string option does affect Go bools. | The string option does not affect Go bools. | StringOption |
| The string option does not recursively affect sub-values of the Go field value. | The string option does recursively affect sub-values of the Go field value. | StringOption |
| The string option sometimes accepts a JSON null escaped within a JSON string. | The string option never accepts a JSON null escaped within a JSON string. | StringOption |
| A nil Go slice is marshaled as a JSON null. | A nil Go slice is marshaled as an empty JSON array. | NilSlicesAndMaps |
| A nil Go map is marshaled as a JSON null. | A nil Go map is marshaled as an empty JSON object. | NilSlicesAndMaps |
| A Go array may be unmarshaled from a JSON array of any length. | A Go array must be unmarshaled from a JSON array of the same length. | Arrays |
| A Go byte array is represented as a JSON array of JSON numbers. | A Go byte array is represented as a Base64-encoded JSON string. | ByteArrays |
| MarshalJSON and UnmarshalJSON methods declared on a pointer receiver are inconsistently called. | MarshalJSON and UnmarshalJSON methods declared on a pointer receiver are consistently called. | PointerReceiver |
| A Go map is marshaled in a deterministic order. | A Go map is marshaled in a non-deterministic order. | MapDeterminism |
| JSON strings are encoded with HTML-specific characters being escaped. | JSON strings are encoded without any characters being escaped (unless necessary). | EscapeHTML |
| When marshaling, invalid UTF-8 within a Go string is silently replaced. | When marshaling, invalid UTF-8 within a Go string results in an error. | InvalidUTF8 |
| When unmarshaling, invalid UTF-8 within a JSON string is silently replaced. | When unmarshaling, invalid UTF-8 within a JSON string results in an error. | InvalidUTF8 |
| When marshaling, an error does not occur if the output JSON value contains objects with duplicate names. | When marshaling, an error does occur if the output JSON value contains objects with duplicate names. | DuplicateNames |
| When unmarshaling, an error does not occur if the input JSON value contains objects with duplicate names. | When unmarshaling, an error does occur if the input JSON value contains objects with duplicate names. | DuplicateNames |
| Unmarshaling a JSON null into a non-empty Go value inconsistently clears the value or does nothing. | Unmarshaling a JSON null into a non-empty Go value always clears the value. | MergeNull |
| Unmarshaling a JSON value into a non-empty Go value follows inconsistent and bizarre behavior. | Unmarshaling a JSON value into a non-empty Go value always merges if the input is an object, and otherwise replaces. | MergeComposite |
| A time.Duration is represented as a JSON number containing the decimal number of nanoseconds. | A time.Duration is represented as a JSON string containing the formatted duration (e.g., "1h2m3.456s"). | TimeDurations |
| Unmarshaling a JSON number into a Go float beyond its representation results in an error. | Unmarshaling a JSON number into a Go float beyond its representation uses the closest representable value (e.g., ±math.MaxFloat). | MaxFloats |
| A Go struct with only unexported fields can be serialized. | A Go struct with only unexported fields cannot be serialized. | EmptyStructs |
| A Go struct that embeds an unexported struct type can sometimes be serialized. | A Go struct that embeds an unexported struct type cannot be serialized. | EmbedUnexported |

See diff_test.go for details about every change.
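The v1 column of the table can be checked directly against the standard library. The following sketch exercises a handful of those rows with v1 encoding/json only; it does not run the v2 package, so the v2 column is not demonstrated here. The helper names are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// mustMarshal marshals v with v1 encoding/json and panics on error.
func mustMarshal(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

// caseInsensitiveName unmarshals an upper-case member name into the field
// Name, which v1 accepts because it matches names case-insensitively.
func caseInsensitiveName() string {
	var v struct{ Name string }
	if err := json.Unmarshal([]byte(`{"NAME":"gopher"}`), &v); err != nil {
		panic(err)
	}
	return v.Name
}

// duplicateName unmarshals an object with a repeated name. v1 reports no
// error and the last occurrence wins.
func duplicateName() int {
	var v struct{ N int }
	if err := json.Unmarshal([]byte(`{"N":1,"N":2}`), &v); err != nil {
		panic(err)
	}
	return v.N
}

func main() {
	var nilSlice []int
	var nilMap map[string]int

	fmt.Println(mustMarshal(nilSlice))          // null (NilSlicesAndMaps)
	fmt.Println(mustMarshal(nilMap))            // null (NilSlicesAndMaps)
	fmt.Println(mustMarshal([3]byte{1, 2, 3}))  // [1,2,3] (ByteArrays)
	fmt.Println(mustMarshal(time.Second))       // 1000000000 (TimeDurations)
	fmt.Println(mustMarshal("<&>"))             // "\u003c\u0026\u003e" (EscapeHTML)
	fmt.Println(caseInsensitiveName())          // gopher (CaseSensitivity)
	fmt.Println(duplicateName())                // 2 (DuplicateNames)
}
```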

Performance

One of the goals of the v2 module is to be more performant than v1, but not at the expense of correctness. In general, v2 is at performance parity with v1 for marshaling, but dramatically faster for unmarshaling.

See https://github.com/go-json-experiment/jsonbench for benchmarks comparing v2 with v1 and a number of other popular JSON implementations.