schibsted / jslt

JSON query and transformation language
Apache License 2.0
638 stars 120 forks source link

Precision issue when sum array #333

Open jerrybyte opened 8 months ago

jerrybyte commented 8 months ago

we want to sum the array, but when we used sum() function, it give a wrong result.

Expression : "roundnum" : sum([150.40, 140.51]), Result: "roundnum" : 290.90999999999997,

larsga commented 8 months ago

I'm afraid this is just the nasty nature of floats. Here's what happens if you do the same thing in Python:

Python 2.7.18 (v2.7.18:8d21aa21f2, Apr 19 2020, 20:48:48)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> 150.40 + 140.51
290.90999999999997
catull commented 8 months ago

If you want precision, you have to adapt your processing a little.

With this input

{
  "data": [
    150.40,
    140.51
  ]
}

you can apply this transformation:

{
  "roundnum": floor (sum (.data) * 100 + 0.5) / 100
}

The result is:

{
  "roundnum" : 290.91
}
catull commented 8 months ago

Try this, to see the changes applied:

{
  "round1": sum (.data),
  "round2": floor (sum (.data) * 100) / 100,
  "roundnum": floor (sum (.data) * 100 + 0.5) / 100
}

It gives you:

{
  "round1" : 290.90999999999997,
  "round2" : 290.9,
  "roundnum" : 290.91
}
jerrybyte commented 8 months ago

I'm afraid this is just the nasty nature of floats. Here's what happens if you do the same thing in Python:

Python 2.7.18 (v2.7.18:8d21aa21f2, Apr 19 2020, 20:48:48)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> 150.40 + 140.51
290.90999999999997

do we have a decimal type to avoid the precision issue ?

jerrybyte commented 8 months ago

Try this, to see the changes applied:

{
  "round1": sum (.data),
  "round2": floor (sum (.data) * 100) / 100,
  "roundnum": floor (sum (.data) * 100 + 0.5) / 100
}

It gives you:

{
  "round1" : 290.90999999999997,
  "round2" : 290.9,
  "roundnum" : 290.91
}

' + 0.5 ' is only valid for the current nunmber... if I change the number, it will not work

catull commented 8 months ago

Actually, it is correct for positive sums. @jerrybyte

Try this:

Input:

{
  "data": [
    [
      150.40,
      140.51
    ],
    [
      150.40,
      140.51,
      10.32
    ],
    [
      150.40,
      140.51,
      10.32,
      15.83
    ]
  ]
}

With this transformation:

def calculate (a) {
  let sum = sum ($a)
  "sum": $sum,
  "sumWith2Digits": floor ($sum * 100) / 100,
  "sumRoundedTo2Digits": floor ($sum * 100 + 0.5) / 100
}

[ for (.data)
  calculate (.)
]

The result is:

[ {
  "sum" : 290.90999999999997,
  "sumWith2Digits" : 290.9,
  "sumRoundedTo2Digits" : 290.91
}, {
  "sum" : 301.22999999999996,
  "sumWith2Digits" : 301.22,
  "sumRoundedTo2Digits" : 301.23
}, {
  "sum" : 317.05999999999995,
  "sumWith2Digits" : 317.05,
  "sumRoundedTo2Digits" : 317.06
} ]
larsga commented 8 months ago

do we have a decimal type to avoid the precision issue ?

What precision are you looking for?