francoispqt / gojay

high performance JSON encoder/decoder with stream API for Golang
MIT License
2.11k stars 113 forks source link

Decoder returns misleading error at EOF #134

Open jordansissel opened 4 years ago

jordansissel commented 4 years ago

When EOF is reached unexpectedly, decoder.DecodeObject reports a misleading error:

Invalid JSON, wrong char '' found at position 0

Upon closer inspection, there's actually a character there, a nul byte, which we can see if printed using %#v with fmt.Printf:

Invalid JSON, wrong char '\x00' found at position 0

So, the underlying problem is that EOF is reached by the stream. The error is misleading because it didn't actually read a NUL, it got EOF and read zero bytes, but the code (in several places that I looked) returns an error based on dec.cursor. At the time of the EOF, the cursor had been moved forward to an unused part of the buffer which happens to contain a nul byte value.

This is a problem because when we saw this in production, we began investigating all kinds of possible sources for the stream to produce a NUL byte. We believe what was actually happening was that a network connection was terminated, yielding an EOF, which gojay tells us this misleading error ;)

I have some code that compares behavior of gojay with encoding/json: https://play.golang.org/p/TJtoTwEbAaH

Output:

"" :: gojay DecodeObject Error: "Invalid JSON, wrong char '\x00' found at position 0"
"" :: json.Unmarshal Error: "unexpected end of JSON input"
"" :: json.Decoder.Decode Error: "EOF"
"{" :: gojay DecodeObject succeeded?
"{" :: json.Unmarshal Error: "unexpected end of JSON input"
"{" :: json.Decoder.Decode Error: "unexpected EOF"
"{\"" :: gojay DecodeObject Error: "Invalid JSON, wrong char '\x00' found at position 2"
"{\"" :: json.Unmarshal Error: "unexpected end of JSON input"
"{\"" :: json.Decoder.Decode Error: "unexpected EOF"

Two interesting findings above:

  1. This issue: EOF is reported mistakenly as "wrong char ''"
  2. gojay.DecodeObject succeeds on { which isn't valid JSON (separate problem)

I started looking into patching gojay to correctly report EOF, but there are so many places that decoder.read() is invoked (and could set dec.err to EOF), I wasn't sure how best to get started.