pml-lang / pml-companion

Java source code of the 'PML Companion (PMLC)'
https://www.pml-lang.dev
GNU General Public License v2.0

Parser Inconsistency with Quoted IDs #96

Closed: tajmone closed this issue 1 year ago

tajmone commented 1 year ago
PMLC 3.1.0 | Win 10

According to the documentation, an ID must conform to the regex pattern:

[a-zA-Z_][a-zA-Z0-9_\.-]*

but if you try to compile the following document:

[doc [title ( id = "valid-ID  " ) Quoted ID]]

the actual ID in the generated HTML includes the trailing spaces:

<h1 id="valid-ID  " class="pml-doc-title">Quoted ID</h1>

PMLC should either (1) emit an error due to invalid ID definition, or (2) strip away the extra spaces from the final ID. IMO solution 1 is better because it enforces stricter documents.
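As a rough illustration of solution 1, the check could be sketched in Java along these lines. The class and method names (`IdCheck`, `isValidId`, `requireValidId`) are hypothetical and not taken from the PMLC code base; only the pattern comes from the documentation quoted above.

```java
import java.util.regex.Pattern;

// Hypothetical sketch, not PMLC's actual code.
class IdCheck {

    // Pattern quoted from the PML documentation.
    private static final Pattern ID_PATTERN =
        Pattern.compile("[a-zA-Z_][a-zA-Z0-9_\\.-]*");

    // True only if the WHOLE string matches the pattern.
    static boolean isValidId(String id) {
        return ID_PATTERN.matcher(id).matches();
    }

    // Solution 1: reject the ID instead of passing it through to the HTML.
    static String requireValidId(String id) {
        if (!isValidId(id)) {
            throw new IllegalArgumentException("Invalid ID: \"" + id + "\"");
        }
        return id;
    }

    public static void main(String[] args) {
        System.out.println(isValidId("valid-ID"));   // true
        System.out.println(isValidId("valid-ID  ")); // false (trailing spaces)
    }
}
```

The important detail is the full match: a partial match (e.g. `find()`) would accept `valid-ID  ` because the leading part of the string satisfies the pattern.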

Side Note

I came across this inconsistency while implementing quoted IDs in Sublime PML. Originally I didn't add support for them, because I thought no one would use them since IDs can't contain spaces, but then I noticed that the PML documentation frequently quotes IDs, so I started working on this feature.

Quoted IDs add extra layers of complexity to editor syntaxes because of the many edge cases that can be encountered. For example, if there isn't a closing quote on the same line, the syntax needs to recover gracefully so it doesn't break the whole document; other cases include a valid ID followed by spaces.

Whenever I work on a new syntax feature, I always run some practical experiments with PMLC to check how the parser handles edge cases, so that I can balance how far Sublime PML supports edge cases against how PMLC handles them in real document conversion. This is how I came across this parser bug, i.e. by testing edge-case handling.

I was quite surprised that such a parsing inconsistency/bug could have slipped by unnoticed; I would have expected the PMLC repository to have a test suite covering ID edge cases. Without a solid test suite it's going to be hard to ensure linear, regression-free growth of PML. Even with Sublime PML, which is just an editor syntax, I wouldn't be able to work on it without covering each feature with extensive tests, to ensure that new features don't break previous ones.

Even though I'm not a Java expert, I know that Java probably has more test libraries than any other language out there, so it shouldn't be hard to add a solid test suite to PMLC.
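To give an idea of what such ID edge-case coverage might look like, here is a minimal table-driven sketch. It is not PMLC's test suite: the class `IdEdgeCaseTests` and its methods are hypothetical, the validator is a stand-in built only from the documented pattern, and a real project would more likely use a test framework such as JUnit rather than this plain standard-library loop.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical sketch of table-driven ID tests; not part of PMLC.
class IdEdgeCaseTests {

    // Stand-in validator based on the documented ID pattern.
    private static final Pattern ID_PATTERN =
        Pattern.compile("[a-zA-Z_][a-zA-Z0-9_\\.-]*");

    static boolean isValidId(String id) {
        return ID_PATTERN.matcher(id).matches();
    }

    // Runs every case and returns the number of failures.
    static int runAll() {
        Map<String, Boolean> cases = new LinkedHashMap<>();
        cases.put("valid-ID", true);     // plain valid ID
        cases.put("_a.b-c1", true);      // underscore, dot, hyphen, digit
        cases.put("valid-ID  ", false);  // trailing spaces
        cases.put("va lid-ID", false);   // internal space
        cases.put("1abc", false);        // must not start with a digit
        cases.put("", false);            // empty ID

        int failures = 0;
        for (Map.Entry<String, Boolean> c : cases.entrySet()) {
            boolean actual = isValidId(c.getKey());
            if (actual != c.getValue()) {
                failures++;
                System.out.println("FAIL: \"" + c.getKey()
                    + "\" expected " + c.getValue() + " but got " + actual);
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        System.out.println("failures: " + runAll());
    }
}
```

Each new edge case found while experimenting with the parser becomes one more row in the table, which keeps regressions visible as features are added.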

tajmone commented 1 year ago

Actually, it seems that spaces within an ID are allowed too:

[doc [title ( id = "va lid-ID" ) Quoted ID]]

The above converts without error, and the generated ID preserves the internal space.
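This case also shows why merely stripping surrounding whitespace (solution 2 from the original report) would not be enough by itself. A small sketch, assuming the documented pattern and using hypothetical names (`TrimThenCheck`, `isValidAfterTrim`) that do not come from PMLC:

```java
import java.util.regex.Pattern;

// Hypothetical sketch, not PMLC's actual code.
class TrimThenCheck {

    // Pattern quoted from the PML documentation.
    private static final Pattern ID_PATTERN =
        Pattern.compile("[a-zA-Z_][a-zA-Z0-9_\\.-]*");

    // Strip leading/trailing whitespace, then validate what is left.
    static boolean isValidAfterTrim(String rawId) {
        return ID_PATTERN.matcher(rawId.trim()).matches();
    }

    public static void main(String[] args) {
        // Trailing spaces disappear after trimming, so this becomes valid.
        System.out.println(isValidAfterTrim("valid-ID  ")); // true
        // The internal space survives trimming, so this is still invalid.
        System.out.println(isValidAfterTrim("va lid-ID"));  // false
    }
}
```

Trimming repairs the first example but leaves the second one invalid, so some form of validation is needed either way.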

pdml-lang commented 1 year ago

PMLC should either (1) emit an error due to invalid ID definition, or (2) strip away the extra spaces from the final ID. IMO solution 1 is better because it enforces stricter documents.

Yes, solution 1 will be applied in the next version.

Thanks for reporting this bug.

pml-lang commented 1 year ago

PMLC should emit an error due to invalid ID definition

Fixed in version 4.0.0