ujson cannot handle long values correctly #248

Open amnaredo opened 3 years ago

amnaredo commented 3 years ago

I noticed that when getting numeric values out of a JSON value, you can only retrieve the number as a double using the num method. So if I know that I'm dealing with a 64-bit integer, my only option is to do this:

val id = json("some_64_bit_integer").num.toLong

(I also tried .string.toLong, but that resulted in a runtime error because numeric values cannot be queried as strings.)

The problem is that this only works reliably for integers up to 2^53 in magnitude; above that we silently lose precision! There is no indication of any error whatsoever, but values greater than 2^53 may simply come out wrong!
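
To make the failure mode concrete, here is a minimal sketch that uses only the Scala standard library (no ujson involved). The helper viaDouble is hypothetical; it just mimics the Long -> Double -> Long round trip that .num.toLong implies:

```scala
object DoublePrecisionDemo {
  // Every integer with magnitude up to 2^53 is exactly representable as a Double.
  val MaxSafeInteger: Long = 1L << 53 // 9007199254740992

  // Hypothetical helper: pass a Long through a Double and back,
  // which is what reading a number as a double and calling .toLong amounts to.
  def viaDouble(n: Long): Long = n.toDouble.toLong

  def main(args: Array[String]): Unit = {
    val samples = Seq(MaxSafeInteger - 1, MaxSafeInteger, MaxSafeInteger + 1, Long.MaxValue - 1)
    for (n <- samples) {
      val back = viaDouble(n)
      // No exception is thrown for the inexact cases; the value is just different.
      println(s"$n -> $back (exact: ${n == back})")
    }
  }
}
```

The first two samples survive the round trip; the last two come back as a different number without any error being raised.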

One possible solution would be to have separate methods for all numeric Java data types (byte, short, int, long, float and double) instead of just num. We could still call the wrong method and potentially lose precision, but at least we'd have the option of getting the correct values out by calling the correct method.

Additionally, numeric JSON values can in theory represent big numbers that do not fit into the signed 64-bit range of a Java long. So ideally we'd have a bigDecimal method as well.
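
As a rough sketch of what exact handling could look like, assuming we had access to the raw numeric literal as text: scala.math.BigDecimal parses it without rounding and can tell us whether it fits in a long. This is not an existing ujson method; exactLong is a hypothetical helper name used only for illustration:

```scala
import scala.util.Try

object ExactNumberDemo {
  // Hypothetical helper: parse a JSON numeric literal exactly and return it
  // as a Long only if it is a whole number inside the signed 64-bit range.
  def exactLong(literal: String): Option[Long] =
    Try(BigDecimal(literal)).toOption // None if the text is not a number at all
      .filter(_.isValidLong)          // rejects fractions and out-of-range values
      .map(_.toLongExact)

  def main(args: Array[String]): Unit = {
    println(exactLong("9223372036854775806")) // fits in a Long
    println(exactLong("9223372036854775808")) // one past Long.MaxValue
    println(exactLong("1.5"))                 // not a whole number
  }
}
```

Returning an Option makes the out-of-range case visible to the caller instead of silently rounding.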

See this for a more detailed explanation of the issue (it's a JavaScript-related writeup, but don't let that confuse you; they face the same precision problems in JS-land because they only have doubles natively :)).

http://2ality.com/2012/07/large-integers.html

ID: 263 Original Author: johnnovak

amnaredo commented 3 years ago

Please include a complete self-contained example code snippet demonstrating the issue

Original Author: lihaoyi

amnaredo commented 3 years ago

Closing due to inactivity.

Original Author: lihaoyi

amnaredo commented 3 years ago

I think this is still the case (upickle:1.2.2). Even reading and rendering doesn't preserve numbers. E.g.

  val jsonStr = """
    {
      "Long.MaxValue": 9223372036854775807,
      "Long.MaxValue - 1": 9223372036854775806,
      "Long.MaxValue - 2": 9223372036854775805
    }
  """
  val json = ujson.read(jsonStr)
  println(json.render(2))

prints

{
  "Long.MaxValue": 9223372036854775807,
  "Long.MaxValue - 1": 9223372036854775807,
  "Long.MaxValue - 2": 9223372036854775807
}

which is different from the original input.

Original Author: radekm

amnaredo commented 3 years ago

@radekm that behavior is expected. ujson.Value follows JavaScript semantics, where every number is a double, so large numbers behave as shown. For your example you can use upickle.default.read[Map[String, Long]], or use a different AST (https://www.lihaoyi.com/upickle/#OtherASTs) if you want a more precise AST structure (at the expense of performance and other things).

Original Author: lihaoyi
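
A minimal sketch of the first suggestion, assuming the upickle library is on the classpath (it is not part of the standard library). The object and method names here are made up for illustration; the only library call is upickle.default.read:

```scala
object ReadAsLongDemo {
  // Read the JSON object straight into a typed Map, bypassing ujson.Value,
  // so the values are parsed as Long rather than stored as Double.
  def parseLongs(jsonStr: String): Map[String, Long] =
    upickle.default.read[Map[String, Long]](jsonStr)

  def main(args: Array[String]): Unit = {
    val jsonStr = """
      {
        "Long.MaxValue": 9223372036854775807,
        "Long.MaxValue - 1": 9223372036854775806,
        "Long.MaxValue - 2": 9223372036854775805
      }
    """
    // Print each key with the Long that was read for it.
    parseLongs(jsonStr).foreach { case (k, v) => println(s"$k -> $v") }
  }
}
```

This only helps when the shape of the document is known up front; for arbitrary JSON you would still need an AST that keeps the exact number.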

amnaredo commented 3 years ago

But JavaScript is shit and we should do better, IMHO 😎

Original Author: johnnovak