federomero / pretty-json

Atom plugin. Format JSON documents.
MIT License
94 stars 23 forks

Feature Request: Prevent side effects that change the format of Numbers #46

Open kwerle opened 8 years ago

kwerle commented 8 years ago

{"foo": 6.0}

->

{ "foo": 6 }

lexicalunit commented 8 years ago

Numbers in JavaScript are always 64-bit floating-point values. The trailing .0 here does not carry any semantic value; for example, it does not indicate integer vs. float as it would in a language like C. Part of prettification is standardizing the format of things, and this includes numbers; however, this formatting is not pretty-json's doing. The formatting you see here is built into your JavaScript engine's number-to-string conversion, which is in turn most likely based on the mathematical standard form for decimals. For example, directly from my JavaScript console:

>>> number = 6.0
6
>>> number = 6.1234
6.1234

kwerle commented 8 years ago

OK, that's true of JavaScript, and I know what JSON stands for - but JavaScript is not the only language that uses JSON. Even humans have been known to read JSON. And 6.00 certainly means something different than 6 in many contexts.

Offhand I know that Ruby and Python both JSON-encode {foo: 6.0} as {"foo": 6.0}. I would be surprised if most languages don't do it that way.

I really think this is a shortcoming in pretty-json (which is a package I love and have used since it was available).

lexicalunit commented 8 years ago

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/JSON/stringify

Boolean, Number, and String objects are converted to the corresponding primitive values during stringification, in accord with the traditional conversion semantics.

Theoretically, if you were working directly with the stringify function in code, you could pass in a replacer function that processed Number objects differently. As for Ruby and Python, both of those languages have distinct floating-point and integral data types, so it makes sense that their default JSON encoding algorithm keeps the trailing insignificant digits. Thankfully, since all Numbers in JS are floating point, it doesn't hurt anything for Ruby and Python to do this. 6, 6.0, and 6.00 are all valid representations of 6 as far as JSON is concerned.
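
A minimal sketch of what a replacer can and cannot do here, assuming plain Node.js or a browser console with no extra libraries. By the time a replacer runs, JSON.parse has already turned 6.0 and 6 into the same Number value, so the replacer has nothing to tell them apart, and if it returns a string such as "6.0" that string is serialized with quotes. The variable names are illustrative only.

```javascript
// Both spellings parse to the same Number value.
const a = JSON.parse('{"foo": 6.0}');
const b = JSON.parse('{"foo": 6}');
console.log(a.foo === b.foo); // true

// Round-tripping through parse + stringify drops the trailing ".0".
const roundTripped = JSON.stringify(a);
console.log(roundTripped); // {"foo":6}

// A replacer sees only the parsed value. Returning a string from it
// yields a quoted JSON string, not a bare number token.
const withReplacer = JSON.stringify(a, (key, value) =>
  typeof value === 'number' ? value.toFixed(1) : value
);
console.log(withReplacer); // {"foo":"6.0"}
```

So a replacer could impose one uniform number format, but on its own it probably cannot restore whatever spelling the original text used.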

However, if any language acted differently when parsing 6.0 and 6 from JSON data, that would be a bug, because the only thing you can assume when reading a number in JSON is that it is a 64-bit floating-point number; that is all that JavaScript claims.

kwerle commented 8 years ago

In ruby, for example:

h = JSON.parse '{"foo": 6.0}'
{ "foo" => 6.0 }
h["foo"].class
Float < Numeric
h = JSON.parse '{"foo": 6}'
{ "foo" => 6 }
h["foo"].class
Fixnum < Integer

So ruby certainly does treat the parsing of 6.0 differently than 6. And it does it in a way that I expect.

But mostly I expect a "pretty" function to not alter anything but whitespace (and maybe coloring).

lexicalunit commented 8 years ago

My guess is that Ruby doesn't have a universal Number data type like JS does, and the Ruby parser is making its best guess as to how to represent the JSON data when translated into Ruby data structures. Of course, a user might represent numbers as strings in JSON, for example when representing money, so as to avoid IEEE floating-point representation limitations. So the final translation of JSON data will be done in the client code according to the author's knowledge of the JSON data itself. Lexical casting is not an uncommon operation when working with JSON data in a language with strict data types (see C and C++). I tested similar code in Python and it does the same thing. Ruby and Python are very similar languages, so this isn't too surprising.

Upon further thinking, it is perfectly safe to represent 6 as an int, as there's no fractional part. It's also not difficult in most languages to translate from an integer to a floating point without losing information. In Python, for example, you'd probably never care either way so long as you used the correct mathematical operations according to your use case (integer vs. floating-point division, for example).
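
As a hedged illustration in JavaScript (the language Atom itself runs on): converting an integer to a 64-bit float is lossless only while the integer fits in the 53-bit significand of a double. The sketch below checks that with the built-in Number and BigInt conversions; the helper name fitsInDouble is made up for this example, and it assumes its input is a plain decimal integer string.

```javascript
// Hypothetical helper: true when an integer written as a decimal string
// survives conversion to a 64-bit float unchanged.
function fitsInDouble(text) {
  return BigInt(Number(text)) === BigInt(text);
}

console.log(fitsInDouble('6'));                // true
console.log(fitsInDouble('9007199254740991')); // true  (2^53 - 1)
console.log(fitsInDouble('9007199254740993')); // false (2^53 + 1)
```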

However, Atom is built on JS technology, so we can't rely on these cute JSON parser features to save us from the reality of JSON data. JSON stringification of Number data will likely not change anytime soon, and there's little I can do to work around that.

lexicalunit commented 8 years ago

Out of curiosity I just went out and did a quick survey of JSON formatting libraries available via node and all of them operate exactly the same as pretty-json does.

https://www.npmjs.com/package/json-honey

> var honey = require('json-honey')
> honey('{"foo": 6.0}')
{ foo: 6 }
> honey({"foo": 6.0})
'{\n  "foo": 6\n}'

https://www.npmjs.com/package/jsonpretty

> jp('{"foo": 6.0}')
'"{\\"foo\\": 6.0}"'
> jp({"foo": 6.0})
'{\n  "foo": 6\n}'

https://www.npmjs.com/package/json-pretty

> jsonPretty('{"foo": 6.0}')
'"{\\"foo\\": 6.0}"'
> jsonPretty({"foo": 6.0})
'{\n  "foo": 6\n}'

https://www.npmjs.com/package/json-format

> jsonFormat('{"foo": 6.0}')
'"{\\"foo\\": 6.0}"'
> jsonFormat({"foo": 6.0})

https://www.npmjs.com/package/ppjson

$ cat d.json
{
"foo": 6.0
}
$ ppjson < d.json
{
    foo: 6
}

kwerle commented 8 years ago

While that's disappointing, I completely understand it and appreciate the discussion. It's not like I'm going to stop using pretty-json - it's a great package; I'll just keep this in mind.

It might be worth a note, somewhere, stating that this is a side-effect of converting.

Thanks!

lexicalunit commented 8 years ago

That's a good idea. Honestly I was expecting to find at least one library out there that would implement formatting in such a way that it maintained Number value formatting. Like if there was a package out there that didn't rely on JSON.stringify() to handle values, it would almost certainly maintain formatting as a side effect, if not as an intentional feature.

lexicalunit commented 8 years ago

Reopening this with an invitation for anyone to take a crack at this themselves. There may be a library out there that I don't know about that would resolve this issue. I don't have the time to take a look myself right now, but I'm totally willing to code review any PRs y'all come up with.

@kwerle This open issue will serve as documentation of the fact that prettification/minification currently does have side effects on the formatting of Numbers.

qiaojianjack commented 7 years ago

I also spotted this when I was using pretty-json. Really appreciate all the discussions here, however I found out that some of the json formatter websites do not change the format of numbers:

e.g.
https://jsonformatter.curiousconcept.com/ (#1 on google search "json formatter")
http://jsonviewer.stack.hu/

I'm not familiar with JavaScript or web technology, but I assume they should be also using some JS technology to achieve this since they are websites?

lexicalunit commented 7 years ago

I messaged the owners of the site but it seems their code is closed source. What I think it does tho is just step over the text applying some simple transformations to it. This allows it to be a bit more robust in the face of JSON syntax errors. For example if you put invalid JSON in there like

{"foo": 42,}

it will still happily format it for you as:

{  
   "foo":42,
}

while also pointing out the error: Invalid comma, expecting }. This is pretty neat and a totally different approach than any of the JSON formatters I've found available via npm. Those and pretty-json rely on parsing valid JSON and then stringifying it, which leads to data representation issues.
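
A rough sketch of what such a text-stepping approach might look like, in JavaScript. This is only a guess at the general idea and not the actual algorithm of any site or package mentioned here; the function name formatJsonText and its indentUnit parameter are hypothetical. It walks the input one character at a time and rewrites only the whitespace outside of strings, so number tokens such as 6.0 are copied through untouched. It does no validation at all.

```javascript
// Hypothetical helper, not pretty-json's or any site's actual algorithm.
// Only whitespace outside of strings is rewritten; every other character
// (including each digit of a number) is copied verbatim.
function formatJsonText(text, indentUnit = '  ') {
  let out = '';
  let depth = 0;
  let inString = false;

  for (let i = 0; i < text.length; i++) {
    const ch = text[i];

    if (inString) {
      out += ch;
      if (ch === '\\') {
        // Copy the escaped character as-is so \" does not end the string.
        i++;
        if (i < text.length) out += text[i];
      } else if (ch === '"') {
        inString = false;
      }
      continue;
    }

    if (ch === '"') {
      inString = true;
      out += ch;
    } else if (ch === '{' || ch === '[') {
      depth++;
      out += ch + '\n' + indentUnit.repeat(depth);
    } else if (ch === '}' || ch === ']') {
      depth = Math.max(0, depth - 1);
      out += '\n' + indentUnit.repeat(depth) + ch;
    } else if (ch === ',') {
      out += ',\n' + indentUnit.repeat(depth);
    } else if (ch === ':') {
      out += ': ';
    } else if (!/\s/.test(ch)) {
      // Part of a number, true, false, or null: copy verbatim.
      out += ch;
    }
    // Whitespace outside strings is dropped and regenerated above.
  }
  return out;
}

console.log(formatJsonText('{"foo": 6.0, "bar": [1.50, 2]}'));
```

Because nothing is parsed into a Number, 6.0 and 1.50 come out exactly as they went in. The trade-off is that this sketch reports no syntax errors and does not special-case things like empty objects, which a real implementation would need to handle.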

Unfortunately, I don't have the time to re-implement whatever algorithm sites like https://jsonformatter.curiousconcept.com/ are using. I also tried searching for an open source library similar to their solution but I couldn't find one.