JuliaIO / JSON.jl

JSON parsing and printing
Other
313 stars 101 forks source link

RFC: Add `allownan` keyword argument to parse() (Fixes #168) #280

Closed kmsquire closed 5 years ago

kmsquire commented 5 years ago

The ideal solution to reading out-of-spec JSON files was proposed by @TotalVerb in #169.

Until that is implemented, this provides a pragmatic solution for people to read these types of JSON files when they don't have control over the file generation.

kmsquire commented 5 years ago

There are no tests yet. I'll wait for any discussion on whether or not this is desired first.

ararslan commented 5 years ago

Seems okay to me in principle but I'm not all that familiar with the JSON spec nor the internals of this package so I'm probably not the best person to review this.

TotalVerb commented 5 years ago

I think that if we have a permissive mode like this we should also support the Python output nan, inf, -inf (note case). I'm not sure if any Python code generates this output though.

kmsquire commented 5 years ago

I think that if we have a permissive mode like this we should also support the Python output nan, inf, -inf (note case). I'm not sure if any Python code generates this output though.

According to the Python json package docs, it outputs the JavaScript values:

If allow_nan is false (default: True), then it will be a ValueError to serialize out of range float values (nan, inf, -inf) in strict compliance of the JSON specification. If allow_nan is true, their JavaScript equivalents (NaN, Infinity, -Infinity) will be used.

simplejson and rapidjson follow the same convention. ujson doesn't output any of these values.

Google's JavaScript GSON module also accepts NaN, Infinity, -Infinity. It does not output them by default, but allows this behavior to be overridden (see https://static.javadoc.io/com.google.code.gson/gson/2.6/com/google/gson/GsonBuilder.html#serializeSpecialFloatingPointValues()).

Given this, I don't think we need to support standard Python values in JSON.

kmsquire commented 5 years ago

While looking through some of the packages in the previous comment, I ran across this snippet (from https://tools.ietf.org/html/rfc7159.html#section-9):

   A JSON parser transforms a JSON text into another representation.  A
   JSON parser MUST accept all texts that conform to the JSON grammar.
   A JSON parser MAY accept non-JSON forms or extensions.

Given this, I think what we're doing here is not too unreasonable.

That said

kmsquire commented 5 years ago

I'll add tests, and see if anyone else wants to review.

kmsquire commented 5 years ago
  1. I added tests
  2. I refactored the code to reduce code duplication. @TotalVerb, can you review again if you get the chance?

The codecov results are a bit hard to understand--it looks like we're not providing the right line numbers, and/or some of the code that was added was simply inlined later.

kmsquire commented 5 years ago

Okay, thanks @TotalVerb. Merging! (Edit: after squashing and running CI to make sure I didn't screw up the squash...)