tajmone / ST4-Asciidoctor

AsciiDoc Package for SublimeText 4
https://tajmone.github.io/ST4-Asciidoctor
MIT License

RE: Eliminate Negative-Lookbehind for Leading Escapes #52

Open tajmone opened 6 months ago

tajmone commented 6 months ago

Many of the bad RegExs reported when running "Syntax Test - Regex Compatibility" (i.e. those incompatible with the new ST RegEx engine, due to the presence of look-behind assertions) seem to check for the presence of a leading escape character.

The negative look-behind can safely be removed from such RegExs, provided that the syntax ensures that the context that handles escape sequences is always executed before them, i.e. we're positively sure that no unattended escape sequences can slip through and be missed by those contexts that currently rely on a negative look-behind for an escape backslash: (?<!\\).
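The idea can be illustrated outside of the syntax definition with a minimal Python sketch. It assumes a simplified scanner in which rules are tried in order at each position; the patterns and function names are hypothetical and not taken from the package. When the escape rule is tried first and consumes the backslash together with the character it protects, the delimiter rule no longer needs the look-behind guard:

```python
import re

# Hypothetical rules, tried in order at each position (first match wins).
# Simplification: a backslash followed by a formatting mark is an escape.
ESCAPE = re.compile(r"\\[*_`#]")
STAR = re.compile(r"\*")  # note: no (?<!\\) guard


def delimiters_consume_first(text):
    """Offsets of unescaped '*', found by handling escapes before delimiters."""
    found, pos = [], 0
    while pos < len(text):
        m = ESCAPE.match(text, pos)
        if m:
            pos = m.end()  # skip the backslash and the escaped character
            continue
        if STAR.match(text, pos):
            found.append(pos)
        pos += 1
    return found


def delimiters_lookbehind(text):
    """Offsets of unescaped '*', found with the negative look-behind."""
    return [m.start() for m in re.finditer(r"(?<!\\)\*", text)]


sample = r"a \*literal\* star and *bold* text"
assert delimiters_consume_first(sample) == delimiters_lookbehind(sample)
```

This only checks one sample line, so it shows the principle rather than proving equivalence for every input; the real guarantee has to come from the order in which the syntax contexts are executed.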

In order to achieve this, we first need to disentangle the order of execution of these contexts, to ensure that escapes are always handled before the contexts that currently rely on the look-behind are attempted. In this respect, we might even consider moving the context that handles escape sequences into the prototype context, as long as we remember to manually disable the prototype (meta_include_prototype: false) in those contexts where escaping doesn't apply.
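As a rough model of what the prototype approach implies, here is a small Python sketch. It assumes a toy engine in which the prototype rules are prepended to every context unless that context opts out, which is what meta_include_prototype: false does in a real syntax definition. The context names, rule labels and patterns are hypothetical:

```python
import re

# Hypothetical prototype: rules injected at the top of every context.
PROTOTYPE = [("escape", re.compile(r"\\[*_`#]"))]

# Hypothetical contexts; "include_prototype" stands in for
# meta_include_prototype in a .sublime-syntax file.
CONTEXTS = {
    "inline": {
        "include_prototype": True,
        "rules": [("delimiter", re.compile(r"[*_`#]"))],
    },
    "literal_block": {
        "include_prototype": False,  # escaping does not apply here
        "rules": [],
    },
}


def effective_rules(name):
    """Rules of a context, with the prototype prepended unless disabled."""
    ctx = CONTEXTS[name]
    head = PROTOTYPE if ctx["include_prototype"] else []
    return head + ctx["rules"]


def tokenize(text, context):
    """Scan left to right; at each position the first matching rule wins."""
    rules = effective_rules(context)
    tokens, pos = [], 0
    while pos < len(text):
        for label, rx in rules:
            m = rx.match(text, pos)
            if m:
                tokens.append((label, m.group(0)))
                pos = m.end()
                break
        else:
            pos += 1  # plain text, nothing to scope
    return tokens
```

With this model, tokenizing the text \*x* in the "inline" context yields an escape followed by a single delimiter, while the same text in "literal_block" yields no tokens at all. The cost of the approach is visible too: every context where escaping doesn't apply has to remember to opt out.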

tajmone commented 6 months ago

Arguments Against Using prototype

Although adding the context that handles backslash escaping to the prototype context sounds like a tempting quick solution, there are some considerations to be taken into account:

  1. AsciiDoc also needs to account for "double escaping" (e.g. \\__func__), which escapes the two underscores after the backslashes to ensure that neither of them is treated as a formatting delimiter.
  2. The way preprocessor directives are escaped is a bit more complicated: a single \ at the beginning of the line ensures that the entire directive is ignored and treated as raw text. Our syntax currently doesn't skip the entire directive, so some part of it might produce a false positive match (e.g. the square brackets being treated as a macro).

Chances are that the context that handles escape sequences will have to develop into a "smart context" that can account for the above, in which case it might not play well with being included in prototype.
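To make the two cases listed above more concrete, here is a hedged Python sketch of what such a "smart" escape handler would have to tell apart. The patterns are deliberate simplifications (the directive pattern covers only a few directive names, and the function name is hypothetical), so this is an illustration of the problem rather than a proposal for the actual RegExs:

```python
import re

# Assumption: a simplified pattern for an escaped preprocessor directive
# occupying a whole line; real directives take more forms than this.
ESCAPED_DIRECTIVE = re.compile(
    r"\\(?:include|ifdef|ifndef|ifeval|endif)::\S*\[.*\]\s*$"
)
# Two backslashes protecting a doubled (unconstrained) formatting mark.
DOUBLE_ESCAPE = re.compile(r"\\\\(?:__|\*\*|##|``)")
# One backslash protecting a single formatting mark.
SINGLE_ESCAPE = re.compile(r"\\[*_#`]")


def classify_escapes(line):
    """Return (kind, matched_text) pairs for the escapes found in one line."""
    m = ESCAPED_DIRECTIVE.match(line)
    if m:
        # The whole directive is raw text: nothing inside it is scanned,
        # so its square brackets cannot be mistaken for a macro.
        return [("directive", m.group(0))]
    found, pos = [], 0
    while pos < len(line):
        # The double escape must be tried before the single one.
        for kind, rx in (("double", DOUBLE_ESCAPE), ("single", SINGLE_ESCAPE)):
            m = rx.match(line, pos)
            if m:
                found.append((kind, m.group(0)))
                pos = m.end()
                break
        else:
            pos += 1
    return found
```

The directive case is line-anchored and swallows the rest of the line, whereas the other two are purely inline. That difference in scope is one reason a single escape context injected everywhere through prototype may be a poor fit.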

It might still be possible to split the handling of backslash escapes across different contexts in order to meet all the mentioned goals, but it could turn out to be trickier than it might seem at first. In any case, it's something that needs to be considered thoroughly, and covered by extensive syntax tests.

