Open ethanvc opened 1 month ago
The Multiline
option used to be called Expand
, which somewhat avoided this oddity. \cc @veqryn
The documentation for Multiline
says:
the JSON output should expand to multiple lines, where every JSON object member or JSON array element appears on a new, indented line according to the nesting depth.
In other words, it affects the emission of newlines within top-level JSON values, and does not affect the emission of newlines between top-level JSON values.
It is a valid use-case to have output where each top-level JSON value be non-multiline, but separated by newlines:
{ "index" : { "_index" : "test", "_id" : "1" } }
{ "field1" : "value1" }
{ "delete" : { "_index" : "test", "_id" : "2" } }
{ "create" : { "_index" : "test", "_id" : "3" } }
{ "field1" : "value3" }
{ "update" : {"_id" : "1", "_index" : "test"} }
{ "doc" : {"field2" : "value2"} }
Such a format is usually called "newline delimited JSON".
Furthermore, the current semantic matches the behavior of the v1 "json" package.
Thus, the current semantic is reasonable, though the naming of the option may need to be adjusted to be clear that it only takes effect within a top-level value.
The Marshal
and MarshalWrite
functions internally avoid the trailing newline because they use an unexported OmitTopLevelNewline
option. Perhaps exposing this option is what you really want?
One reason we did not expose OmitTopLevelNewline
is because of the following edge case:
func main() {
var buf bytes.Buffer
encoder := jsontext.NewEncoder(&buf, jsontext.OmitTopLevelNewline(true))
encoder.WriteToken(jsontext.Uint(98765))
encoder.WriteToken(jsontext.Uint(43210))
fmt.Println(buf.String())
}
What should this do?
43210
? (i.e., we only permit encoding a single top-level token)9876543210
in which case it is impossible to parse back the original input?98765 43210
, where we special-case putting a whitespace in-between JSON numbers to avoid any ambiguities?An alternative API is an option that emits newlines between top-level values, instead of after every top-level value. The distinction is subtle, but accomplishes what you would want:
between | after |
---|---|
"{}" | "{}\n" |
"{}\n{}" | "{}\n{}\n" |
"0" | "0\n" |
"0\n1" | "0\n1\n" |
One reason we did not expose
OmitTopLevelNewline
is because of the following edge case:func main() { var buf bytes.Buffer encoder := jsontext.NewEncoder(&buf, jsontext.OmitTopLevelNewline(true)) encoder.WriteToken(jsontext.Uint(98765)) encoder.WriteToken(jsontext.Uint(43210)) fmt.Println(buf.String()) }
What should this do?
- Report an error when writing
43210
? (i.e., we only permit encoding a single top-level token)- Emit
9876543210
in which case it is impossible to parse back the original input?- Emit
98765 43210
, where we special-case putting a whitespace in-between JSON numbers to avoid any ambiguities?
I think the first option is good and if we want write multi top json value in one buffer, we should do it explicitly.
An alternative API is an option that emits newlines between top-level values, instead of after every top-level value. The distinction is subtle, but accomplishes what you would want:
between after "{}" "{}\n" "{}\n{}" "{}\n{}\n" "0" "0\n" "0\n1" "0\n1\n"
Yes, this is what i want, and i think is more reasonable.
so, what's the plan for this topic? @dsnet
I agree this is something to address, but I think we'll defer addressing it until the v2 work lands in stdlib. There's already enough other work to do in preparation for that.
This behavior is not expected by library users. Expected output: "{}" without the tail \n.