Bunny83 / SimpleJSON

A simple JSON parser in C#
MIT License
735 stars 294 forks source link

Bug in parsing long? #16

Closed wvdvegt closed 6 years ago

wvdvegt commented 6 years ago

Hi

In ParseElement() the following code is used

if (double.TryParse(token, NumberStyles.Float, CultureInfo.InvariantCulture, out val))
    return val;
else
    return token;

However all:

Double.TryParse(Int64.MaxValue.ToString(),out Double d)
(Double)Int64.MaxValue
Int64.MaxValue == Double.Parse(Int64.MaxValue.ToString())

happily return true and a value 9.2233720368547758E+18 Where Int64.MaxValue is 9223372036854775807 (very close but incorrect and 7 off).

The new longAsString does not seem to help here as the conversion is already performed.

Not sure if this is solvable (other then keeping numbers as strings until they are requested with an As** Method). Maybe for plain numbers, checking presence of the decimal separator and the length in characters might work.

wvdvegt commented 6 years ago

Maybe checking Int64.TryParse() BEFORE using Double.TryParse() might do the trick.

Checking presence of decimal separator/exponent might (and skip the Int64.TryParse()) might improve speed a bit.

david-pfx commented 6 years ago

As I commented in another issue, double only round-trips about 17 decimal digits (exactly how many varies by the value). Getting wrong numbers is not acceptable. Better to return just the string and leave it to the caller to do the conversion. Or raise an exception if it's out of range.

wvdvegt commented 6 years ago

Some context:

I need it for a swagger definition that has some long ID's in it. Not being able to read the retrieve the correct and exact value or get an exception is not an option either (besides that it slows down a lot).

I tried the Int64.TryParse and it correctly returns false when the numeric string (for example a string containing Int64.MaxValue+1) cannot be converted. I was very surprised Double.TryParse(Int64.MaxValue) returned true (which is an error).

So:

  1. ) Check on the smaller type with bigger precision using Int64.TryParse().
  2. ) Then checking then on if the converted value is inside Int32.Max/MinValue range, if so Double can be used for storage, else store the Int64 values as a string.
  3. ) If Int64.TryParse() fails, use Double.

In addition one could pre-check the length of the string and presence of . and e or E to see if it's a floating point. But I'm not sure it's worth the effort.

FYI SimpleJSON works like charm and code is both compact, portable and fast (making me a very happy user). It frees me of coding numerous model classes to catch the output.

One feature request: Please add .AsString as alias for .Value (for symmetry purposes).

wvdvegt commented 6 years ago

Adding

            long lval;
            if (longAsString && long.TryParse(token, out lval))
                return lval;

just before the

            double rval;
            if (double.TryParse(token, NumberStyles.Float, CultureInfo.InvariantCulture, out rval))
                return rval;
            else
                return token;

in ParseElement seems to work ok. If longAsString is false, trying to get a long has no use as the result is stored in a double and truncated in precision again.

david-pfx commented 6 years ago

Correction: the limit on round-tripping a double is 15 decimal digits: (log10(2^53-1)).

Bunny83 commented 6 years ago

There is no "bug" as JSON does not have an explicit definition for any particular binary representation of a number. As you can read in the RFC7159 the double precision floating point format is the one that is widely used and you should not expect any parser to provide better precision than double. Note that Javascript itself represents a number as double.

The implicit / explicit long conversion is just for the ease of use and was never meant to change how the json is parsed.

I would not recommend adding the long.TryParse before the double.TryParse as this would turn any whole number into a string. So doing a read-modify-write on some json text could change the actual data type from number to string.

Any sorts of IDs, hashes or keys should be represented as string inside the json itself and not as number to avoid any compatibility issues. A JSONString can be implicitly converted to long in case you need the actual value as long.

The current implementation will not be changed as it behaves as everyone would expect. Feel free to add a "JSONLong" or "JSONBigInt" to your copy if you think you need it ^^.

Yes I already thought of either renaming Value or adding an AsString property.

wvdvegt commented 6 years ago

I agree, a simple solution for everyone is not there. I'm still a very happy user of SimpleJSON.

FYI Showed the issue to a colleague of mine and he could not resists in checking it in Java and Python, both have a 4 digit loss!

In Java: System.out.println(Double.parseDouble("" + Long.MAX_VALUE)); System.out.println("" + Long.MAX_VALUE); print: 9.223372036854776E18 9223372036854775807

In 64-bit Python 3.6: float(sys.maxsize) int(sys.maxsize) print: 9.223372036854776e+18 9223372036854775807