Closed JaredCE closed 5 years ago
Do they not appear in the output at all?
At the tokenization level, escaped backslashes and curly braces are treated like any other control symbols / words, so \{, \par, \ldblquote, and \someRandomThing are all emitted as type CONTROL, as technically they are, and it's up to the higher-level stream (DeEncapsulate, as an example) to interpret these special control symbols / words as text.
It would be possible for the tokenizer to emit these special control words as text, but then it would need knowledge of the meaning / semantics of specific control words. I wanted to keep the tokenizer focused purely on syntax, so I left this semantic interpretation as the responsibility of a higher level.
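To illustrate the split, here is a rough, self-contained sketch. It is not this library's actual code or API: the token shapes and the `tokenize`, `textForControl`, and `toText` names are made up for the example, and numeric parameters are stepped over and discarded. The first pass only knows syntax and emits every backslash sequence as CONTROL; the second pass is the one that knows a handful of control symbols / words stand for text.

```typescript
// Hypothetical token shapes for illustration only (not the library's types).
type Token =
  | { type: "GROUP_START" }
  | { type: "GROUP_END" }
  | { type: "CONTROL"; word: string }
  | { type: "TEXT"; text: string };

// Syntax-only pass: any backslash sequence becomes CONTROL, with no
// knowledge of what the word or symbol means.
function tokenize(rtf: string): Token[] {
  const tokens: Token[] = [];
  let i = 0;
  while (i < rtf.length) {
    const ch = rtf.charAt(i);
    if (ch === "{") {
      tokens.push({ type: "GROUP_START" });
      i += 1;
    } else if (ch === "}") {
      tokens.push({ type: "GROUP_END" });
      i += 1;
    } else if (ch === "\\") {
      const rest = rtf.slice(i + 1);
      const word = /^[a-zA-Z]+/.exec(rest);
      if (word) {
        // Control word: also step over an optional numeric parameter
        // (discarded in this sketch) and one delimiter space.
        const full = /^[a-zA-Z]+(-?\d+)? ?/.exec(rest);
        tokens.push({ type: "CONTROL", word: word[0] });
        i += 1 + (full ? full[0].length : word[0].length);
      } else {
        // Control symbol: a backslash plus one non-letter, e.g. \{ \} \\
        tokens.push({ type: "CONTROL", word: rtf.charAt(i + 1) });
        i += 2;
      }
    } else {
      // Plain text runs until the next brace or backslash.
      let j = i;
      while (j < rtf.length && "{}\\".indexOf(rtf.charAt(j)) === -1) j += 1;
      tokens.push({ type: "TEXT", text: rtf.slice(i, j) });
      i = j;
    }
  }
  return tokens;
}

// Semantic knowledge lives here: which control symbols / words mean text.
function textForControl(word: string): string {
  switch (word) {
    case "\\":
    case "{":
    case "}":
      return word; // escaped literal characters
    case "par":
      return "\n";
    case "ldblquote":
      return "\u201C";
    case "rdblquote":
      return "\u201D";
    default:
      return ""; // unknown or purely formatting control: no text
  }
}

// Higher-level pass: turns the token stream into plain text.
function toText(tokens: Token[]): string {
  let out = "";
  for (const t of tokens) {
    if (t.type === "TEXT") out += t.text;
    else if (t.type === "CONTROL") out += textForControl(t.word);
  }
  return out;
}
```

With this split, `tokenize("a\\{b")` yields TEXT, CONTROL, TEXT, and only `toText` turns the middle token back into a literal `{`.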
I'm willing to listen to any arguments against this, however. If you're using just the tokenizer and not the de-encapsulator, it may be a bit too low-level... perhaps there is room for another layer in between that does more semantic interpretation but isn't focused solely on de-encapsulation.
Please let me know if you have an example of slashes or curly braces not coming out as text through the DeEncapsulate layer.
After further playing around with this library, it seems that you're not treating escaped backslashes (\) and curly braces ({ and }) correctly: they're not coming up as text, but as either control or group types.