Open tajmone opened 3 years ago
the line code for the escape character is forced to contain a trailing space
Well spotted!
The reason is that the current parser uses a regex that does not consider this edge-case.
The new pXML parser (which only reads a sequence of characters, no regexes) will parse [c \\] correctly as a node c with content \.
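To illustrate the difference, here is a minimal Python sketch, not the actual PML or pXML code. The pattern `NAIVE_CLOSE` and the functions `naive_content` and `parse_inline_code` are hypothetical names made up for this example. The first function mimics a regex that treats any `]` preceded by a backslash as escaped; the second reads one character at a time and consumes each escape pair as a unit.

```python
import re

PREFIX = "[c "

# Hypothetical naive rule: a "]" closes the node unless a backslash precedes it.
# This is an assumption for illustration, not the regex used by the PML parser.
NAIVE_CLOSE = re.compile(r"(?<!\\)\]")


def naive_content(src: str):
    """Return the raw text up to the first unescaped-looking "]", or None."""
    match = NAIVE_CLOSE.search(src, len(PREFIX))
    return None if match is None else src[len(PREFIX):match.start()]


def parse_inline_code(src: str) -> str:
    """Read the node one character at a time, resolving backslash escapes."""
    if not src.startswith(PREFIX):
        raise ValueError("expected a [c node")
    out = []
    i = len(PREFIX)
    while i < len(src):
        ch = src[i]
        if ch == "\\":
            # An escape always consumes the next character, whatever it is.
            if i + 1 >= len(src):
                raise ValueError("dangling escape character")
            out.append(src[i + 1])
            i += 2
        elif ch == "]":
            return "".join(out)
        else:
            out.append(ch)
            i += 1
    raise ValueError("missing closing bracket")


src = r"[c \\]"
print(naive_content(src))      # None: the closing bracket looks escaped
print(parse_inline_code(src))  # a single backslash
```

Because the scanner consumes the two backslashes as a pair before it ever looks at the bracket, the trailing-space workaround is no longer needed in this sketch.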
However, it's a very good idea to add 'word_joiner' and 'empty' nodes. They can help to explicitly eliminate ambiguities like this, and they are useful in other cases as well, as you mentioned. Will be done. Easy to implement.
I personally prefer [empty to [blank, for I believe it's clearer, and I'd avoid having both, since it's redundant.
I agree.
the line code for the escape character is forced to contain a trailing space (\ ) — in the source file
This bug has been fixed in version 2.0.0
I've noticed that in the PML User Manual, section Anatomy of a PML Document » Attributes, the line code for the escape character is forced to contain a trailing space (\ ) in the source file 05_anatomy.pml.
The problem here is that using [c \\] instead of [c \\ ] won't work because it would be parsed as [c + \ + \], i.e. the second backslash is interpreted as escaping the closing bracket.
To avoid similar problems (which are typical edge cases found in all lightweight syntaxes) I suggest adding some extra special characters:
[empty or [blank: replaced by nothing (empty string), post-parsing. Its sole role is to feed a token separator to the parser.
[wj: word-joiner character (U+2060), a Unicode code point that prevents a line break at its position.
(Obviously, no closing bracket is required for either.)
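As a rough sketch of the post-parsing replacement described above, the following Python snippet maps each proposed token to its output. The table `SUBSTITUTIONS` and the function `substitute` are hypothetical names for illustration only. A real implementation would presumably work on parsed tokens instead of plain string replacement, so treat this as an assumption-laden simplification.

```python
WORD_JOINER = "\u2060"  # Unicode WORD JOINER

# Hypothetical token table modelled on the proposal; not PML's implementation.
SUBSTITUTIONS = {
    "[empty": "",          # replaced by nothing, only separates tokens
    "[wj": WORD_JOINER,    # forbids a line break at this position
}


def substitute(text: str) -> str:
    """Replace each proposed token with its output string."""
    for token, replacement in SUBSTITUTIONS.items():
        text = text.replace(token, replacement)
    return text


joined = substitute("someword[wj[1]")
print(len(joined))  # 12: eight letters, one word joiner, three for "[1]"
```

Note that neither token takes a closing bracket here, matching the proposal.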
The above example from the PML User Manual could then be fixed via:
Both of these are useful hacks to handle edge cases where the PML parser could be faced with ambiguities like the above example, and they would be the equivalents of Asciidoctor's predefined character-substitution attributes {empty}/{blank} and {wj}, which are extremely useful for handling all sorts of edge cases in AsciiDoc sources.
In Asciidoctor, {empty} and {blank} are identical, one is just an alias of the other; I personally prefer [empty to [blank, for I believe it's clearer, and I'd avoid having both, since it's redundant.
The [wj is also very useful in situations where you need to prevent the browser from wrapping a table column during auto-adjustment (e.g. because one column contains words separated by boundaries like spaces, hyphens, brackets, etc.), or to prevent wrapping a line between a word and its footnote marker, e.g. someword[1] → someword + \n + [1], whereas someword[wj[1] would keep the marker attached to the word.
Sometimes they can also just improve source readability.
These would be consistent with the current [nl and [sp substitutions available in PML.