MiSawa / xq

Pure rust implementation of jq
MIT License

Handle large integer values losslessly? #152

Open travisbrown opened 2 years ago

travisbrown commented 2 years ago

Currently large integer values are formatted in scientific notation (in some cases lossily). For example (from a user JSON object from the Twitter API):

$ xq .id < twitter-test.json 
1.470944601309528e18
$ jq .id < twitter-test.json 
1470944601309528000
$ gojq .id < twitter-test.json 
1470944601309528072

Is this intentional? I'm currently using gojq instead of jq specifically because of how it handles values like this, and the lossless approach seems like it would generally be the least likely to cause issues for users.

MiSawa commented 2 years ago

That is because xq, probably jq, and most other tools that process JSON use a double-precision floating-point number to represent a JSON Number. Since a double can't represent every integer outside the [-2^53+1, 2^53-1] range exactly, 1470944601309528072, 1.470944601309528e18, and 1470944601309528000 all result in the same double (assuming some rounding mode). gojq handles integers specially to support such use cases, but I dropped that support since

MiSawa commented 2 years ago

Though I see the value in it. Maybe it would be good to treat integer-looking input as-is as much as possible when the user asks for it? (related: #93, #82)
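
To make the double-precision point concrete, here is a minimal Rust sketch. It first checks that the three spellings from this thread parse to the same f64, then shows one possible way to keep integer-looking input as-is. The `Number` enum and `parse_number` function are hypothetical names made up for illustration, not xq's actual types, and the parser does not validate the JSON number grammar.

```rust
/// Hypothetical number representation for illustration; not xq's actual type.
#[derive(Debug, PartialEq)]
enum Number {
    /// Integer literal that fits in an i64, kept exactly.
    Int(i64),
    /// Everything else falls back to double precision.
    Float(f64),
}

/// Parse a number literal, preferring an exact i64 when the text allows it.
/// Assumption: the caller has already checked that `literal` is valid JSON;
/// Rust's own parsers accept a few spellings (such as "+5" or "inf") that JSON does not.
fn parse_number(literal: &str) -> Option<Number> {
    if let Ok(i) = literal.parse::<i64>() {
        return Some(Number::Int(i));
    }
    literal.parse::<f64>().ok().map(Number::Float)
}

fn main() {
    // The three spellings from this thread all round to the same double,
    // because doubles of this magnitude are spaced 256 apart.
    let a: f64 = "1470944601309528072".parse().unwrap();
    let b: f64 = "1.470944601309528e18".parse().unwrap();
    let c: f64 = "1470944601309528000".parse().unwrap();
    assert_eq!(a, b);
    assert_eq!(b, c);
    assert_eq!(a as i64, 1_470_944_601_309_528_064);

    // With the integer-preserving path, the original digits survive.
    assert_eq!(
        parse_number("1470944601309528072"),
        Some(Number::Int(1_470_944_601_309_528_072))
    );
    // Non-integer literals still go through f64.
    assert_eq!(parse_number("1.5"), Some(Number::Float(1.5)));
    println!("{:?}", parse_number("1470944601309528072"));
}
```

This only covers integers within the i64 range; anything larger would still fall back to f64 and lose digits, so it is a partial answer at best.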

travisbrown commented 2 years ago

@MiSawa Thanks for the reply!

For me personally, the general principle I'd prefer in most contexts is that the tool should not change values that the user did not specifically ask to be transformed.

I just learned that this is what jq has done for numeric values for a couple of years in the master branch (although not in the latest official release). For example:

$ jq <<< "0.0001000" 
0.0001000
$ jq <<< '18276318.736187263187638172'
18276318.736187263187638172
$ jq <<< '10000000000000000000000000000000000000012'
10000000000000000000000000000000000000012

(gojq gives the same result for the integral value, but drops the trailing zeros on the first example, and rounds the second.)

MiSawa commented 2 years ago

Ah, interesting: they have introduced decimal number arithmetic, so it doesn't just preserve the user's input as a string but actually treats it as a decimal number with the given precision. https://github.com/stedolan/jq/tree/master/src/decNumber

$ ./jq <<< '0.1010e2'
10.10
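
A rough way to see why `0.1010e2` comes out as `10.10` is to model the value as a coefficient plus a power-of-ten exponent, which keeps trailing zeros as part of the precision. The Rust sketch below does only that. The `Decimal` struct, `parse_decimal`, and `to_plain` are hypothetical names for illustration; this is not jq's decNumber code, it does no arithmetic, and it always prints plain notation.

```rust
/// Hypothetical decimal value meaning `digits` × 10^`exponent`.
/// For illustration only; not jq's or xq's actual representation.
#[derive(Debug, PartialEq)]
struct Decimal {
    negative: bool,
    /// Coefficient digits with leading zeros removed and trailing zeros kept.
    digits: String,
    exponent: i32,
}

/// Parse a literal such as "0.1010e2" into coefficient and exponent.
/// Assumption: lenient parsing, not a full JSON number validator.
fn parse_decimal(literal: &str) -> Option<Decimal> {
    let (negative, rest) = match literal.strip_prefix('-') {
        Some(r) => (true, r),
        None => (false, literal),
    };
    // Split off the optional exponent part ("e2", "E-3", ...).
    let (mantissa, exp) = match rest.find(|c: char| c == 'e' || c == 'E') {
        Some(i) => (&rest[..i], rest[i + 1..].parse::<i32>().ok()?),
        None => (rest, 0),
    };
    let (int_part, frac_part) = mantissa.split_once('.').unwrap_or((mantissa, ""));
    let all_digits = |s: &str| s.chars().all(|c| c.is_ascii_digit());
    if int_part.is_empty() || !all_digits(int_part) || !all_digits(frac_part) {
        return None;
    }
    let joined = format!("{}{}", int_part, frac_part);
    let trimmed = joined.trim_start_matches('0');
    let digits = if trimmed.is_empty() { "0" } else { trimmed }.to_string();
    // Each fractional digit shifts the exponent down by one.
    let exponent = exp - frac_part.len() as i32;
    Some(Decimal { negative, digits, exponent })
}

/// Render in plain notation, keeping every digit of the coefficient.
fn to_plain(d: &Decimal) -> String {
    let sign = if d.negative { "-" } else { "" };
    if d.exponent >= 0 {
        if d.digits == "0" {
            return format!("{}0", sign);
        }
        return format!("{}{}{}", sign, d.digits, "0".repeat(d.exponent as usize));
    }
    let frac_len = (-d.exponent) as usize;
    // Left-pad with zeros so at least one digit precedes the decimal point.
    let padded = format!("{:0>width$}", d.digits, width = frac_len + 1);
    let (int_part, frac_part) = padded.split_at(padded.len() - frac_len);
    format!("{}{}.{}", sign, int_part, frac_part)
}

fn main() {
    let d = parse_decimal("0.1010e2").unwrap();
    // Coefficient 1010 with exponent -2, so the trailing zero is kept.
    assert_eq!((d.digits.as_str(), d.exponent), ("1010", -2));
    assert_eq!(to_plain(&d), "10.10");
    println!("{}", to_plain(&d));
}
```

The same sketch round-trips the literals from the earlier jq examples (`0.0001000`, `18276318.736187263187638172`, and the 41-digit integer) unchanged, since none of their digits are ever converted to a double.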