quinnj / JSON3.jl


Support user-defined functions for serialising Inf and NaN #292

Open hhaensel opened 2 weeks ago

hhaensel commented 2 weeks ago

Currently, Inf and NaN are translated to Infinity and NaN if allow_inf = true is passed to JSON3.write().

Unfortunately, the standard JSON parser in the browser does not support this syntax. A typical workaround is a regex substitution of Infinity with "Infinity", which is slow and error-prone.

If only Inf translation is needed, a nice hack is to translate Inf to 1e1000, which is converted to Infinity by the built-in number parser rather than by the JSON parser. If NaN is also needed, the only possibility I am aware of is a reviver. But as there is no standard format that I could propose, I thought a customisable solution would be nice.

I propose to support user-defined translations via value types; Inf, -Inf and NaN are output as RawType() for which the users can define their own values, e.g.

JSON3.rawbytes(::Val{Inf}) = codeunits("1e1000")
JSON3.rawbytes(::Val{-Inf}) = codeunits("-1e1000")
JSON3.rawbytes(::Val{NaN}) = codeunits("__nan__")

so that

julia> JSON3.write((a = Inf, b = -Inf32, c = NaN), allow_inf = true)
"{\"a\":1e1000,\"b\":-1e1000,\"c\":\"__nan__\"}"
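
For illustration, here is a minimal, self-contained sketch of how dispatch on value types could select the output bytes. It does not use JSON3; nonfinite_bytes and write_float are hypothetical names standing in for the proposed hook and for the writer, and the NaN token is quoted here (an assumption) so that the result is a valid JSON string.

```julia
# Hypothetical hook standing in for the proposed JSON3.rawbytes; not JSON3's actual API.
nonfinite_bytes(::Val{Inf})  = codeunits("1e1000")
nonfinite_bytes(::Val{-Inf}) = codeunits("-1e1000")
# Quoted here (an assumption) so that the emitted token is a valid JSON string.
nonfinite_bytes(::Val{NaN})  = codeunits("\"__nan__\"")

# Write one float: finite values print normally, non-finite ones go through the hook.
function write_float(io::IO, x::AbstractFloat)
    if isfinite(x)
        print(io, x)
    else
        # Normalise to Float64 so that Inf32 and Inf16 reach the same Val{Inf} method.
        key = isnan(x) ? NaN : Float64(x)
        write(io, nonfinite_bytes(Val(key)))
    end
    return nothing
end

sprint(write_float, -Inf32)   # "-1e1000"
sprint(write_float, NaN)      # "\"__nan__\""
```

Users would override the three nonfinite_bytes methods to change the output globally, which is the main difference from a per-call keyword argument.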

I've prepared a PR, which I will submit for consideration. There is a slight performance reduction of 1% vs. the existing treatment of Inf, while NaNs are treated a bit faster. I'd consider the changes negligible, given that these values occur rather rarely.

hhaensel commented 2 weeks ago

... or would you prefer a version via a keyword argument?

hhaensel commented 2 weeks ago

I've added another version with a keyword argument under the branch hh-infinity2. I first tried a Dict mapping, but that performed much slower, so I went with a functional mapping.

julia> mapping(x) = x == Inf ? "__inf__" : x == -Inf ? "-1e1000" : "__nan__"
julia> JSON3.write([Inf32,-Inf32, NaN], inf_mapping = mapping)
"[Infinity,-Infinity,NaN]"
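
As a rough sketch of the keyword-argument idea, independent of JSON3: write_floats, std_mapping and browser_mapping below are hypothetical names, and the writer only handles flat vectors of floats. The mapping function is called only for non-finite values and returns the raw text to emit.

```julia
# Default mapping, mirroring what allow_inf = true emits today.
std_mapping(x) = isnan(x) ? "NaN" : x > 0 ? "Infinity" : "-Infinity"

# Hypothetical writer for a vector of floats; not JSON3.write.
function write_floats(xs; inf_mapping = std_mapping)
    io = IOBuffer()
    print(io, '[')
    for (i, x) in enumerate(xs)
        i > 1 && print(io, ',')
        # Finite values print as usual; non-finite ones use the user-supplied mapping.
        print(io, isfinite(x) ? x : inf_mapping(x))
    end
    print(io, ']')
    return String(take!(io))
end

# A browser-friendly mapping: 1e1000 overflows to Infinity in the number parser,
# and NaN is emitted as a quoted marker string for a reviver to pick up.
browser_mapping(x) = isnan(x) ? "\"__nan__\"" : x > 0 ? "1e1000" : "-1e1000"

write_floats([1.5, Inf, -Inf, NaN])
# "[1.5,Infinity,-Infinity,NaN]"
write_floats([1.5, Inf, -Inf, NaN]; inf_mapping = browser_mapping)
# "[1.5,1e1000,-1e1000,\"__nan__\"]"
```

Because the mapping is passed per call, the same application can use different mappings for different consumers.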

hhaensel commented 2 weeks ago

One thought: could it be that a number is not finite but also neither NaN nor Inf? Then the default mapping should probably rather look like this:

_std_mapping(x) = x == Inf ? "Infinity" : x == -Inf ? "-Infinity" : isnan(x) ? "NaN" : string(x)
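
For the floating-point types in Base, a non-finite value is always Inf, -Inf or NaN, so the string(x) fallback should only be reached by finite inputs. The quick check below is a sketch that repeats the mapping under the name std_mapping (the name is an assumption) and runs it over several float types. Note that isnan is needed because NaN == NaN is false.

```julia
std_mapping(x) = x == Inf ? "Infinity" : x == -Inf ? "-Infinity" : isnan(x) ? "NaN" : string(x)

# Each non-finite value of each float type should hit one of the first three branches.
for T in (Float16, Float32, Float64, BigFloat)
    for x in (T(Inf), T(-Inf), T(NaN))
        @assert !isfinite(x)
        @assert std_mapping(x) in ("Infinity", "-Infinity", "NaN")
    end
end

# Comparing with == cannot detect NaN, hence the isnan branch.
@assert (NaN == NaN) == false

# Finite values fall through to string(x).
@assert std_mapping(1.5) == "1.5"
```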

hhaensel commented 2 weeks ago

After some re-thinking, I have a slight preference for the kwarg-solution, because users could serialize for different purposes/backends in one application.