kdl-org / kdl

the kdl document language specifications
https://kdl.dev
Other
1.12k stars 62 forks source link

`identifier-char` grammar is invalid #345

Closed tabatkins closed 9 months ago

tabatkins commented 10 months ago

Currently the identifier-char grammar is:

identifier-char := unicode - linespace - [\/(){}<>;[]=,"]

The bit at the end is using a [...] character class, but it tries to include [] characters as well. These presumably need to be escaped somehow.

(It's also not clear to me, on inspection, whether the first \ is actually meant to be a \ char, or if it's escaping the following / char. This grammar isn't using an official grammar language, so it's not quite clear what the conventions are.)

zkat commented 10 months ago

You know what, I don't know why we ban \ at all. Let's just allow it and make Windows users happy.

zkat commented 10 months ago

and yeah, we should allow / too.

zkat commented 10 months ago

oh right. That's why we don't allow /. Comments.

I guess this means we need to add some special cases so /*, /- and // are not valid identifiers.

larsgw commented 10 months ago

Those should be invalid identifier prefixes in that case, /*a should still start a comment.

zkat commented 10 months ago

oh right, and \ is disallowed because of line continuations.

...I'm thinking that maybe we should just keep banning \ and / and just make it clearer to resolve this Issue, so it doesn't seem like we're using \ to escape / in the grammar.

and yeah, we should clarify []

larsgw commented 9 months ago

Right, I'm realizing now it's not even just prefixes, it's those sequences anywhere in the identifier.

zkat commented 9 months ago

It's too bad. I was really into the idea of supporting #path/to/foo and #C:\foo\bar but I just don't see a good way to do so without a lot of "will this parse as an identifier?" overhead. Unless I'm thinking about it too much

zkat commented 9 months ago

The fixes for this have been merged into the kdl-v2 branch