json5 / json5-spec

The JSON5 Data Interchange Format
https://spec.json5.org
MIT License

Clarify meaning of hexadecimal values #6

Open mcraiha opened 5 years ago

mcraiha commented 5 years ago

My current problem is with the definition of hexadecimal numbers. The spec says:

Hexadecimal numbers contain the literal characters 0x or 0X that may be prefixed with an optional plus or minus sign, which must be followed by one or more hexadecimal digits.

So if I have the value

positiveHex: 0xFFDECAFFEEDDEEDD

does that mean it stores

or maybe something else?

jordanbtucker commented 5 years ago

Thanks for the question. It means 18437397125074382557. Do you have any recommendations on how to reword the spec to be clearer?
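For readers who want to check that figure, here is a minimal Python sketch of the quoted rule (optional sign, then 0x or 0X, then one or more hex digits). The names HEX_LITERAL and parse_hex_literal are made up for illustration and are not part of any JSON5 library; the sketch assumes the literal is read as a plain base 16 magnitude with the sign applied afterwards.

```python
import re

# Quoted rule: optional plus or minus sign, 0x or 0X, one or more hex digits.
HEX_LITERAL = re.compile(r"([+-]?)0[xX]([0-9a-fA-F]+)")


def parse_hex_literal(text: str) -> int:
    """Return the integer value of a JSON5-style hexadecimal literal."""
    match = HEX_LITERAL.fullmatch(text)
    if match is None:
        raise ValueError(f"not a hexadecimal literal: {text!r}")
    sign, digits = match.groups()
    magnitude = int(digits, 16)  # plain base 16 positional value
    return -magnitude if sign == "-" else magnitude


value = parse_hex_literal("0xFFDECAFFEEDDEEDD")
print(value)              # 18437397125074382557
print(value > 2**63 - 1)  # True: does not fit a signed 64-bit integer
print(value > 2**53)      # True: past the range where a double holds every integer exactly
```

Python integers are arbitrary precision, so the result is exact here. A parser that stores numbers as 64-bit doubles or signed 64-bit integers would have to round or reject this particular literal.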

mcraiha commented 5 years ago

I do not have a good rewording, since explaining it exactly would require a definition of endianness and maybe a formula for converting values between formats. Maybe I would use an example section with comments that show what each hex value should equal, e.g.

{
    // 0xdecaf as hex is equal to 912559 as integer
    positiveHex: 0xdecaf,
    // 0xC0FFEE as hex is equal to 12648430 as integer, so -0xC0FFEE equals -12648430
    negativeHex: -0xC0FFEE,
}

jordanbtucker commented 5 years ago

I was going to say that hex literals should be interpreted as they are in the ES5 spec, but the ES5 spec isn't much help. Here's a snippet from what it has to say about hex literals.

  • The MV of HexIntegerLiteral :: 0x HexDigits is the MV of HexDigits.
  • The MV of HexIntegerLiteral :: 0X HexDigits is the MV of HexDigits.
  • The MV of HexDigits :: HexDigit is the MV of HexDigit.
  • The MV of HexDigits :: HexDigits HexDigit is (the MV of HexDigits × 16) plus the MV of HexDigit.

Technically it defines how hex literals should be interpreted, but it's not in clear language.
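The quoted MV (mathematical value) rules amount to a left-to-right recurrence: multiply the running value by 16, then add the next digit. A small Python sketch, assuming exactly those four rules; mv_hex_digits and mv_hex_integer_literal are illustrative names, not ES5 or JSON5 APIs.

```python
HEX_DIGITS = "0123456789abcdef"


def mv_hex_digits(digits: str) -> int:
    """MV of HexDigits, accumulated one digit at a time."""
    mv = 0
    for ch in digits:
        # MV of (HexDigits HexDigit) = (MV of HexDigits * 16) + MV of HexDigit
        # str.index raises ValueError for a character that is not a hex digit.
        mv = mv * 16 + HEX_DIGITS.index(ch.lower())
    return mv


def mv_hex_integer_literal(literal: str) -> int:
    """MV of HexIntegerLiteral: drop the 0x or 0X prefix, take the MV of the digits."""
    if literal[:2] not in ("0x", "0X") or len(literal) < 3:
        raise ValueError(f"not a hex integer literal: {literal!r}")
    return mv_hex_digits(literal[2:])


print(mv_hex_integer_literal("0xdecaf"))   # 912559
print(mv_hex_integer_literal("0XC0FFEE"))  # 12648430
```

Nothing in the recurrence refers to byte order or bit width; it only describes a positional base 16 number.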

mcraiha commented 5 years ago

parseInt describes how a hex string should be converted to an integer: https://tc39.github.io/ecma262/#sec-parseint-string-radix

I think step 13 is what I am looking for, but its wording is too "specsy".

mindplay-dk commented 4 years ago

Aren't platform concerns outside the scope of the specification?

I don't think JSON stipulates any details about platform constraints?

Platform constraints are going to vary among platforms (duh 🙄) so I don't think it makes sense for a standard that describes a transport encoding to stipulate any details about how numbers should be stored or represented outside of the file, e.g. in memory, in a database, etc.?

AFAIK, JSON only defines the number type as an unlimited series of digits:

In that sense, the JSON standard has the same "problem", if somebody wants to store, say, 99999999999999999999999999999999999999999999999999999999999999999999 as a number, right?
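To illustrate the point with plain JSON, here is a short Python sketch using only the standard json module. The same digits are read once as an arbitrary-precision integer and once as a double; the parse_int=float argument is used here only to imitate a parser that stores every number as a double, which is an assumption for the sake of the example.

```python
import json

digits = "9" * 40  # a long run of nines, similar to the example above

as_int = json.loads(digits)                     # Python keeps every digit
as_float = json.loads(digits, parse_int=float)  # imitate a double-only parser

print(as_int == int(digits))    # True: nothing lost
print(as_float == 1e40)         # True: rounded to the nearest double
print(int(as_float) == as_int)  # False: the two readings disagree
```

Both results come from the same valid JSON text; the difference is entirely in what the host platform chooses to store.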

jordanbtucker commented 4 years ago

Platform constraints are going to vary among platforms (duh 🙄) so I don't think it makes sense for a standard that describes a transport encoding to stipulate any details about how numbers should be stored or represented outside of the file, e.g. in memory, in a database, etc.?

I don't think the representation or storage of numbers outside the JSON5 document is in question here. What's being proposed is that we specify what a hexadecimal number means.

The JSON specification explicitly states that numbers are base 10. So all implementations can agree that 3.14 means three and fourteen hundredths in base 10. This also means that one implementation won't see 11 as the base 16 value that represents 17 in base 10.

As for hexadecimal values, I think that specifying that they are base 16 is enough. There is no ambiguity as to how a base 16 value converts to base 10, but that's not the only use case for hexadecimal values. It's a missed opportunity that parsers, including the reference parser, convert hex values to decimal without providing a way to retrieve the original representation. (I.e., 0x0001 is distinct from 0x1, which is also distinct from 1.0, even though they all represent the same value.)

So I don't think it matters what 0xFFDECAFFEEDDEEDD converts to in base 10. What matters is that 0xFFDECAFFEEDDEEDD is distinct from any base 10 value, even if it can represent that base 10 value.
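One way a parser could keep that distinction is to hold on to the literal text next to the computed value. The Python sketch below is hypothetical: NumberToken is not part of the reference parser or any JSON5 library, and it only handles already-validated hex and plain decimal literals.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class NumberToken:
    """A numeric literal that remembers how it was written (hypothetical type)."""

    raw: str  # literal text exactly as it appeared in the document

    @property
    def is_hex(self) -> bool:
        return self.raw.lstrip("+-")[:2].lower() == "0x"

    @property
    def value(self) -> Decimal:
        # Assumes raw is already a valid literal; no validation is done here.
        if self.is_hex:
            sign = -1 if self.raw.startswith("-") else 1
            return Decimal(sign * int(self.raw.lstrip("+-")[2:], 16))
        return Decimal(self.raw)


a, b, c = NumberToken("0x0001"), NumberToken("0x1"), NumberToken("1.0")
print(a.value == b.value == c.value)  # True: same numeric value
print(len({a.raw, b.raw, c.raw}))     # 3: three distinct representations
print(a == b)                         # False: tokens compare by their text
```

With a token like this, converting to base 10 becomes something the caller asks for rather than something the parser does irreversibly.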