kdl-org / kdl

the kdl document language specifications
https://kdl.dev

Clarification on numeric literals #170

Closed benjreinhart closed 3 years ago

benjreinhart commented 3 years ago

Hello... I am looking for some clarification on a couple of things related to numbers.

(Apologies in advance if this is obvious to everyone; I am new to implementing parsers and interpreting language specs.)

  1. For numbers that specify a radix, the radix prefix MUST be followed immediately by a digit valid in that radix, correct? For example, an underscore cannot immediately follow 0b, 0o, or 0x (e.g., 0b_01 is illegal), right?
  2. If there is a trailing underscore (e.g., 0b01_), the lexer would NOT include it in the number literal, but would instead probably treat it as the start of an identifier (which I think is then illegal from the parser's POV)?
  3. Only a single underscore is valid between digits, so 123___456_789 would be illegal?
  4. In decimal numbers, can an underscore appear in the exponent, i.e., after the e or E?

If folks think it would be helpful to make any of the above worded more explicitly in the spec, I'd be happy to take a stab at that.

🙏🏻

zkat commented 3 years ago

Spec ref:

decimal := integer ('.' [0-9] [0-9_]*)? exponent?
exponent := ('e' | 'E') integer
integer := sign? [0-9] [0-9_]*
sign := '+' | '-'

hex := sign? '0x' hex-digit (hex-digit | '_')*
octal := sign? '0o' [0-7] [0-7_]*
binary := sign? '0b' ('0' | '1') ('0' | '1' | '_')*

Translation:

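  1. Correct, that is illegal: the radix prefix must be followed by a digit ('0x' hex-digit (hex-digit | '_')*).
  2. Trailing underscores are a valid part of numbers, so 0b01_ is a single number literal.
  3. You can have multiple consecutive underscores, so 123___456_789 is legal.
  4. Yes, but only after the first post-E digit (see the definition of exponent above): 1e1_0 is legal, 1e_10 is not.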
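
For anyone implementing this, the four answers above can be checked mechanically by transcribing the quoted rules into regular expressions. The following is only a rough Python sketch, not part of the spec or of any KDL implementation: the names `NUMBER` and `is_number` are made up for illustration, and since `hex-digit` is not defined in the excerpt above, it is assumed here to mean `[0-9a-fA-F]`.

```python
import re

# Each pattern mirrors one rule from the quoted grammar.
SIGN = r"[+-]?"
INTEGER = SIGN + r"[0-9][0-9_]*"          # integer := sign? [0-9] [0-9_]*
EXPONENT = r"[eE]" + INTEGER              # exponent := ('e' | 'E') integer
DECIMAL = INTEGER + r"(?:\.[0-9][0-9_]*)?(?:" + EXPONENT + r")?"

# Assumption: hex-digit is not defined in the quoted excerpt.
HEX_DIGIT = r"[0-9a-fA-F]"
HEX = SIGN + r"0x" + HEX_DIGIT + r"(?:" + HEX_DIGIT + r"|_)*"
OCTAL = SIGN + r"0o[0-7][0-7_]*"
BINARY = SIGN + r"0b[01][01_]*"

# Hypothetical combined pattern for "any number literal".
NUMBER = re.compile("(?:" + "|".join([HEX, OCTAL, BINARY, DECIMAL]) + ")")


def is_number(text: str) -> bool:
    """Return True if the whole string matches one of the number rules."""
    return NUMBER.fullmatch(text) is not None


if __name__ == "__main__":
    for literal in ["0b_01", "0b01_", "123___456_789", "1e1_0", "1e_10"]:
        print(literal, is_number(literal))
```

A real lexer would also need to decide where a number token ends within a longer input; this sketch only answers whether a complete string is a valid literal under the quoted rules.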