kdl-org / kdl

the kdl document language specifications
https://kdl.dev

Disallowing "r#" at the beginning of identifiers #224

Closed zkat closed 9 months ago

zkat commented 2 years ago

Discussed in https://github.com/kdl-org/kdl/discussions/200

Originally posted by **garrisonhh** September 23, 2021

As far as I can tell from the spec, `r#####` is a perfectly valid identifier. At the moment, given a sequence of characters representing a single kdl value or identifier, all types of values can be identified solely based upon their first two characters, except for raw strings because of this particular case. For consistency, I'd also be in favor of disallowing `#` altogether from identifiers. This is a tiny nitpick, but it makes sense to me from an implementer perspective!

tabatkins commented 2 years ago

So I suggested this specifically, and like it better than disallowing `#` entirely from idents; I like KDL's generally quite wide-open ident grammar.

Having given it thought overnight, I think this can be understood as the same as the current prohibition of "looking like a number" that prevents you from starting an ident with `0` or `+0`, but allows signs and digits anywhere else in the ident. Similarly, idents would be prevented from "looking like a raw string".

This would also satisfy @garrisonhh's suggested invariant that all token types can be identified in their first two characters.
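
To make the two-character invariant concrete, here is a minimal sketch in Python, assuming the proposed rule is in place. The function name `classify` and the kind labels are hypothetical, not taken from the spec or from any KDL implementation. The sketch also ignores the `true`/`false`/`null` keywords, so it only roughly approximates the real grammar.

```python
# Hypothetical helper, not part of any KDL library.
# Assumes the proposed rule: an identifier may not start with `r#`.
DIGITS = frozenset("0123456789")
SIGNS = frozenset("+-")


def classify(token: str) -> str:
    """Guess a token's kind from at most its first two characters."""
    if not token:
        raise ValueError("empty token")
    head = token[:2]
    if head[0] == '"':
        return "string"
    # With the proposed rule, `r#` can only begin a raw string, so no
    # further lookahead past the run of `#` characters is needed.
    if head in ('r"', "r#"):
        return "raw-string"
    # The existing "looks like a number" rule: a digit, or a sign
    # followed by a digit, can never begin an identifier.
    if head[0] in DIGITS:
        return "number"
    if len(head) == 2 and head[0] in SIGNS and head[1] in DIGITS:
        return "number"
    return "identifier"
```

Under this sketch, `r#####` is classified as the start of a raw string (and would then fail to parse as one) instead of being accepted as an identifier, while idents such as `rust` or `-foo` are unaffected.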

tabatkins commented 2 years ago

(#241 fixes this; I name-dropped the discussion rather than this issue in the PR.)

zkat commented 9 months ago

This has been fixed because raw strings now start with a plain #", without the re