kdl-org / kdl

the kdl document language specifications
https://kdl.dev

KDL 2.0: prefix keywords with `#` #343

Closed zkat closed 10 months ago

zkat commented 10 months ago

I propose we remove the true, false, and null tokens, ban # from identifiers, and then add #true, #false, and #null to replace them.

This would essentially remove any ambiguity issues from implementing #339.

It also creates a handy namespace for future keywords, if there's ever a KDL 3.0 or 2.x draft.
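For illustration, here is a minimal Python sketch of how a tokenizer might classify bare (unquoted) tokens under this proposal. The name `classify_token` and its return labels are hypothetical. It models only the `#` rules, not the full KDL identifier grammar, and it assumes that a bare `true` would become an ordinary identifier once the old keyword tokens are removed.

```python
# Proposed keywords and the values they would stand for.
KEYWORDS = {"#true": True, "#false": False, "#null": None}


def classify_token(token: str) -> str:
    """Classify a bare token as "keyword", "identifier", or "error".

    Hypothetical helper: only the `#` rules from the proposal are modeled.
    """
    if token.startswith("#"):
        # `#` opens the keyword namespace, so only known keywords are valid.
        return "keyword" if token in KEYWORDS else "error"
    if "#" in token:
        # `#` is banned anywhere in an identifier.
        return "error"
    # Assumption: with the old tokens removed, `true` is just an identifier.
    return "identifier"
```

Under these assumptions, `classify_token("#true")` returns `"keyword"`, `classify_token("true")` returns `"identifier"`, and a misspelling such as `classify_token("#ture")` returns `"error"` instead of being silently accepted.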

dezren39 commented 10 months ago

initially this felt bad, but i came around to this. namespaces are always handy. i can already see some people's first introduction to kdl being a config file that happens to be filled with a ton of "#true #false" in a row and being like 'wtf?' but that is probably ok. would this be ban # from identifiers or add it to the 'not allowed as the first character' list?

edit/add: asking mostly for clarity, i could see either way. can always use double quotes if "widget#1" is needed as a name i think, so blocking out altogether also makes sense.

zkat commented 10 months ago

My intention was to ban # altogether.

tabatkins commented 10 months ago

+1 from me on the initial suggestion of the # prefix. Having zero chance of ambiguity is quite nice, and it helps avoid potential typos (they'd become invalid keywords, making them catchable by a parser or validator).

We're already banning specifically the r# prefix in idents (to avoid an ambiguity with raw strings), and banning a # prefix entirely would be in line with that, and at that point the exceptions are definitely approaching the line where it's worthwhile to instead simplify the restriction and just ban # entirely.

The current set of disallowed chars is \/(){}<>;[]=," - basically "the grouping characters", "the characters used by other KDL syntax", and then a comma for some reason. Adding # would fall nicely into the "used by other KDL syntax" set.
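To make the character set concrete, here is a small Python sketch that checks a candidate identifier against the disallowed characters listed above, with `#` added as suggested. `CURRENT_DISALLOWED`, `PROPOSED_DISALLOWED`, and `disallowed_chars` are illustrative names, and the check deliberately ignores every other identifier rule (whitespace, leading digits, and so on).

```python
# Characters quoted in the comment above: \ / ( ) { } < > ; [ ] = , "
CURRENT_DISALLOWED = set('\\/(){}<>;[]=,"')

# Proposed: ban `#` everywhere in identifiers as well.
PROPOSED_DISALLOWED = CURRENT_DISALLOWED | {"#"}


def disallowed_chars(ident: str, banned=PROPOSED_DISALLOWED) -> list[str]:
    """Return the banned characters found in ident, in order of appearance."""
    return [ch for ch in ident if ch in banned]
```

With the current set, `disallowed_chars("widget#1", CURRENT_DISALLOWED)` returns an empty list. With the proposed set, `disallowed_chars("widget#1")` returns `["#"]`, so the name would have to be written as a quoted string.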

zkat commented 10 months ago

note: after sleeping on this for a few days (and asking about it on social media), I'm not sure I want this after all. It's a really big breaking change and I'm not sure it's worth it. It's also kinda... weird? Maybe it would've been the right decision if we were starting KDL from scratch, but now?

larsgw commented 10 months ago

It's a bit weird but I think I still prefer to distinguish keywords from strings syntactically. So if we remove the specific syntax for (some) strings, I think it makes sense to introduce specific syntax for keywords.

zkat commented 10 months ago

random thought: does this mean that we could also remove the r prefix from r# raw strings, and just make raw strings be #"foo"?

MultisampledNight commented 10 months ago

This makes me think a bit of typst, where there's basically 2 (or 3 if one counts math) languages in one. The default one is markup, where text mostly appears as it is written, with some syntax sugar for highlighting, lists and the like.

However, using #, one can switch to code mode: The block or statement following the # is interpreted in a custom scripting language.

For typst, that decision makes sense since most of the actual document is written in markup (and one can switch inside a code mode block temporarily back to markup mode using [] brackets).

For kdl, I'm not entirely sure. There's no full-blown scripting language behind the scenes, but I believe the basic reasoning works the same as it does for typst: Most documents written in kdl probably aren't just accumulations of true/false/null switches, but perhaps more expressive.

larsgw commented 10 months ago

(hey, typst mention, nice!)

tabatkins commented 10 months ago

> random thought: does this mean that we could also remove the r prefix

That seems to be an independent decision, yeah? The fact that #-strings and raw-strings are tied together right now actually feels slightly weird - #-strings let you safely use quotes in your string, while raw strings let you safely use \ in your string.

zkat commented 10 months ago

Withdrawing this proposal because I hate it, actually.