woodruffw / toml2json

A very small CLI for converting TOML to JSON
https://crates.io/crates/toml2json
MIT License

How are TOML `inf` and `nan` encoded in JSON? #88

Open miccoli opened 2 years ago

miccoli commented 2 years ago

I'm one of those poor souls who rely on Python's ability to JSON-encode and decode IEEE 754 infinity and NaN; see Infinite and NaN Number Values.

Since these values are supported in TOML (see Float in the TOML spec), I wonder if toml2json could emulate this code:

import json
import tomli

tomsf = """
# infinity
sf1 = inf  # positive infinity
sf2 = +inf # positive infinity
sf3 = -inf # negative infinity

# not a number
sf4 = nan  # actual sNaN/qNaN encoding is implementation-specific
sf5 = +nan # same as `nan`
sf6 = -nan # valid, actual encoding is implementation-specific
"""
print(json.dumps(tomli.loads(tomsf), indent=4))

which prints

{
    "sf1": Infinity,
    "sf2": Infinity,
    "sf3": -Infinity,
    "sf4": NaN,
    "sf5": NaN,
    "sf6": NaN
}

Compare this with

#!/bin/bash

toml2json --pretty << 'EOF'
# infinity
sf1 = inf  # positive infinity
sf2 = +inf # positive infinity
sf3 = -inf # negative infinity

# not a number
sf4 = nan  # actual sNaN/qNaN encoding is implementation-specific
sf5 = +nan # same as `nan`
sf6 = -nan # valid, actual encoding is implementation-specific
EOF

which prints

{
  "sf1": null,
  "sf2": null,
  "sf3": null,
  "sf4": null,
  "sf5": null,
  "sf6": null
}

woodruffw commented 2 years ago

I'm one of those poor souls who rely on Python's ability to JSON-encode and decode IEEE 754 infinity and NaN; see Infinite and NaN Number Values.

That's an unfortunate thing to rely on, since it's not in the JSON RFC 🙂. It looks like a handful of other JSON implementations do the same thing (Ruby's, for one), but the behavior is completely unspecified and the output isn't legal JSON according to the spec.
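To make the point concrete, here is a small sketch using only Python's standard `json` module. It shows the default, non-standard output and two ways to opt into stricter behavior. The `reject_constant` helper is a hypothetical name chosen for illustration; `allow_nan` and `parse_constant` are existing parameters of the standard module.

```python
import json

# By default, Python emits the non-standard tokens Infinity and NaN.
lenient = json.dumps({"sf1": float("inf"), "sf4": float("nan")})
print(lenient)

# With allow_nan=False the encoder raises instead of emitting invalid JSON.
try:
    json.dumps({"sf1": float("inf")}, allow_nan=False)
    strict_error = None
except ValueError as exc:
    strict_error = str(exc)
print(strict_error)


def reject_constant(token):
    """Hypothetical hook: refuse NaN, Infinity, and -Infinity while decoding."""
    raise ValueError(f"non-standard JSON token: {token}")


# parse_constant is called for each of the three non-standard tokens.
try:
    json.loads(lenient, parse_constant=reject_constant)
    decode_error = None
except ValueError as exc:
    decode_error = str(exc)
print(decode_error)
```

A decoder that follows the RFC strictly would be expected to reject the `lenient` string above, which is why relying on it across tools is fragile.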

This would probably be very difficult (and maybe not desirable) to support in toml2json, since this tool is just a very thin wrapper around the toml and serde_json traits. The latter is very strict about spec conformance and doesn't have any configurable knobs for emitting things that outright aren't valid JSON.

The closest thing would probably be a custom float serializer that emits string representations for NaNs and +/- Infinity, which I'm not necessarily opposed to. But it would have to be a non-default behavior hidden behind a flag, since it fundamentally violates the well-typedness of the output.
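The string-representation idea can be sketched in Python as follows. This is only an illustration of the approach, not toml2json's Rust implementation; the helper name `stringify_nonfinite` and the chosen strings (`"inf"`, `"-inf"`, `"nan"`) are assumptions.

```python
import json
import math


def stringify_nonfinite(value):
    """Recursively replace inf, -inf, and nan floats with string stand-ins."""
    if isinstance(value, float) and not math.isfinite(value):
        if math.isnan(value):
            return "nan"
        return "inf" if value > 0 else "-inf"
    if isinstance(value, dict):
        return {key: stringify_nonfinite(item) for key, item in value.items()}
    if isinstance(value, list):
        return [stringify_nonfinite(item) for item in value]
    return value


# Stand-in for data parsed from a TOML document.
parsed = {"sf1": float("inf"), "sf3": float("-inf"), "sf4": float("nan"), "ok": 1.5}

# allow_nan=False proves that no non-finite float survived the rewrite.
encoded = json.dumps(stringify_nonfinite(parsed), allow_nan=False)
print(encoded)
```

The output is valid JSON, but a consumer can no longer tell a float that was `inf` from a string that happened to contain `"inf"`, which is the well-typedness concern raised above.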

miccoli commented 2 years ago

This would probably be very difficult (and maybe not desirable) to support in toml2json, since this tool is just a very thin wrapper around the toml and serde_json traits.

Fair enough! Let me just elaborate a little on this issue.

Of course I'm aware that it is impossible to map all valid TOML to valid JSON while preserving semantics. In my opinion, mapping (inf, nan) → null is certainly a sensible choice, but it should be documented for potential users of this tool who, like me, are not Rust programmers.

A --strict mode, in which the translator errors out if semantics are not preserved, could also be useful.

In fact, when using toml2json in a bash script, I forgot that deep in the TOML file there was an inf, and the resulting script failure was hard to understand at first glance. By contrast, a toml2json failure with a clear error message (Line xxx: TOML yyy cannot be translated to JSON) would have been very useful.
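A strict check of this kind could look roughly like the Python sketch below. It is an illustration only: the helper names `find_nonfinite` and `check_strict` are hypothetical, and it reports a key path rather than a line number, because line information is assumed to be unavailable once the TOML has been parsed into plain data.

```python
import math


def find_nonfinite(value, path=""):
    """Yield (path, value) for every inf or nan float in parsed TOML data."""
    if isinstance(value, float) and not math.isfinite(value):
        yield path, value
    elif isinstance(value, dict):
        for key, item in value.items():
            yield from find_nonfinite(item, f"{path}.{key}" if path else key)
    elif isinstance(value, list):
        for index, item in enumerate(value):
            yield from find_nonfinite(item, f"{path}[{index}]")


def check_strict(parsed):
    """Raise ValueError naming the first value that JSON cannot represent."""
    for path, value in find_nonfinite(parsed):
        raise ValueError(f"{path}: TOML value {value} cannot be translated to JSON")


# Stand-in for a parsed TOML document with an inf buried in a nested array.
parsed = {"server": {"limits": [1.0, float("inf")]}, "name": "demo"}
try:
    check_strict(parsed)
    message = None
except ValueError as exc:
    message = str(exc)
print(message)
```

Failing early with the offending key path would make a buried `inf` much easier to track down than a downstream failure on `null`.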

Summing up: please document the (inf, nan) → null mapping and, if feasible while preserving the nice light approach of this utility, consider a --strict mode.

miccoli commented 2 years ago

I hope you don't mind, but I changed the title of this issue to something more useful.