itchyny / gojq

Pure Go implementation of jq
MIT License

literal numbers #83

Closed pkoppstein closed 3 years ago

pkoppstein commented 3 years ago

First my congratulations on gojq, and for having closed out all the open issues. And thank you especially for adhering so closely to the "jq specification".

With regard to the "jq specification", there is, however, a looming problem, having to do with "numbers".

For clarity's sake, let me distinguish between "jq" and "jqMaster", the latter referring to the current "master" version of jq, and more specifically, to the (new) specification of the treatment of numeric literals.

As you know, the (new) approach to numeric literals (whether specified in JSON texts or in jq programs) is to ensure that they are treated with great respect, although not always "literally". The intent, though, is that jq filters such as ".", "tostring", and "tonumber" should preserve the numeric accuracy of numeric literals.

Thus, using jqMaster (the executable), we see:

jqMaster -cM '[., tostring, (tostring|tonumber)]'
1111111111111111111111111111111111111111111
[1111111111111111111111111111111111111111111,"1111111111111111111111111111111111111111111",1111111111111111111111111111111111111111111]

00001
[1,"1",1]

1.234567892345678923456789
[1.234567892345678923456789,"1.234567892345678923456789",1.234567892345678923456789]

1.2300000E+1000
[1.2300000E+1000,"1.2300000E+1000",1.2300000E+1000]

So the looming issue stems from the fact that jqMaster honors the mathematical semantics of numeric literals, whereas gojq only does so for integers.

I believe jqMaster's treatment of numbers accords well with what was probably Douglas Crockford's original intention. (That is, I think he probably intended that the JSON semantics for "number" would accord with the normal meaning of the word in English and mathematics, which partly explains why the original specification was so focused on syntax.)

itchyny commented 3 years ago

In the absence of a specification, I think of gojq as just another interpreter for a jq-like syntax. I'm confident it is highly compatible, but I have already fixed some bugs present in jqMaster and dropped some unimportant filters. In my honest opinion, number and string normalization is an important feature of jq, and I'm not sure how important keeping the precision of floating-point numbers like that is. Most JSON is emitted by programs written in other languages, which likely use 64-bit floating-point numbers. Supporting numbers that do not fit in a 64-bit float would make the code too complicated, especially for the WithFunction option in gojq.

By the way, your second example is related to a jq bug (I have seen it before but can't find the issue): [001] is invalid JSON, yet jq accepts it without error.
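As an illustration of this trade-off, the following sketch uses only Go's standard encoding/json package, not gojq. The helper name reencode is hypothetical. It shows that decoding into float64 (the default) normalizes a long literal, while Decoder.UseNumber keeps the original text as a json.Number.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// reencode (hypothetical helper) decodes a JSON text and encodes it again.
// With useNumber set, numbers are kept as json.Number (the literal text);
// otherwise they are decoded into float64.
func reencode(input string, useNumber bool) (string, error) {
	dec := json.NewDecoder(strings.NewReader(input))
	if useNumber {
		dec.UseNumber()
	}
	var v interface{}
	if err := dec.Decode(&v); err != nil {
		return "", err
	}
	out, err := json.Marshal(v)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	input := "[1.234567892345678923456789]"

	normalized, _ := reencode(input, false)
	preserved, _ := reencode(input, true)

	fmt.Println("float64:    ", normalized)
	fmt.Println("json.Number:", preserved)
}
```

Since json.Number is a string type, any arithmetic on such a value still needs an explicit conversion, which hints at the extra complexity mentioned above.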

pkoppstein commented 3 years ago

itchyny wrote: "I'm not sure how important keeping the precision of floating-point numbers like that."

For reference, jtc also preserves numeric precision for all JSON numbers:

jtc . - <<< 123456789123456789123456789123456789.123456789123456789123456789123456789123456789123456789123456789123456789
123456789123456789123456789123456789.123456789123456789123456789123456789123456789123456789123456789123456789

jtc's author wrote: "jtc ... honors 100% JSON definition for numericals."