stil4m / elm-syntax

Elm syntax in Elm
MIT License
93 stars 28 forks source link

Store raw string for numbers in the AST #108

Open jfmengels opened 3 years ago

jfmengels commented 3 years ago

Currently, if we try to parse the following code

a = 100000000000000000000

we'll get an AST with the following node

Integer 9.999999999999999e+29

It's not much, but there is a precision loss. If we were to write that number using the writer, we would not get the same output as the input.

My thinking is that we could keep the original "text" in the same node, like

Integer 9.999999999999999e+29 "100000000000000000000"

and then the writer could re-use that original text from the AST. elm-review rules could also use this information for other reasons, though I can't think of any.

Downside: It's possible to have different values represented by a and b in Integer a b (though that is also true currently in a way...), but now it's even easier because you can do Integer 1 "2", or even Integer 1 "".

I think the Floatable variant should use the same approach.

I'm thinking we could remove the Hex variant in favor of Integer Int String. I think in most cases (at least in elm-review rules), we don't care how an integer is written but we care about the value more. If we care about integers written in hex, it's reasonably easy to check whether the string starts with 0x.

If Elm ever adds support for writing numbers like 10_000, then we will need to update the parser but we won't need to change the Expression type.

(This was inspired by issue https://github.com/avh4/elm-format/issues/690, where the problem seems to appear for smaller numbers than this one.)

MartinSStewart commented 3 years ago

Will a user want to do math with the Integer or Floatable value? If it contains a string instead of an Int or Float then we've left it up to them to parse the string and they might forget to handle edge cases like hex values. Maybe we could instead have an opaque type that contains the number and helper functions like isHex: IntegerValue -> Bool, toInt : IntegerValue -> Maybe Int, toString : IntegerValue -> String, etc?

jfmengels commented 1 year ago

I don't think people will do math on the expressions no, but I think people will potentially want to pattern match on the numbers Integer 1 _ ->, which is not possible with an opaque type.