dmolesUC closed this 3 years ago
My workaround for now is to disallow `}` in `comparison_string` values, but it would be nice not to have to do that. I looked at the balanced parentheses example, but that's not quite what I'm looking for.
Coming back to this, I realized that for this particular use case there is an escape mechanism that's not captured in the simple grammar: `}` and several other characters can only appear in the body of `comparisonString` if escaped by `\`. New solution:

    # ASCII visible characters, except those that need to be escaped
    rule(:vchar_cs) { match['\u0021-\u007e&&[^!$=?{|}~]'] }

    # ASCII visible characters that need to be escaped
    rule(:vchar_cs_esc) { match['!$=?{|}~'] }

    rule(:comparison_string) do
      (vchar_cs | vchar_cs_esc) >>
        (vchar_cs | (str('\\').ignore >> vchar_cs_esc)).repeat
    end

    rule(:_comparison_string) { str('\\').ignore >> comparison_string.as(:value) }

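
To make the escape idea concrete outside of Parslet, here is a minimal sketch using only Ruby's built-in `Regexp`. It is an illustration, not a drop-in replacement for the rules above: the names `PLAIN`, `ESCAPED`, `BODY`, and `split_comparison_string` are made up for this example, the first character is treated like every other character, and I'm assuming that `\` itself is not a plain character (the `vchar_cs` range above does include it).

```ruby
# Visible ASCII, minus the characters that must be escaped and minus the
# escape character itself (assumption: a bare backslash is not plain text).
PLAIN   = /[\x21-\x7e&&[^!$=?{|}~\\]]/
# A backslash followed by one of the characters that need escaping.
ESCAPED = /\\[!$=?{|}~]/
# Leading backslash, then one or more plain or escaped characters.
BODY    = /\A\\((?:#{PLAIN}|#{ESCAPED})+)/

# Returns [unescaped value, unconsumed remainder], or nil if the input
# does not start with a comparison string.
def split_comparison_string(input)
  m = BODY.match(input)
  return nil unless m

  [m[1].gsub(/\\(.)/, '\1'), m.post_match]
end

split_comparison_string('\A}')     # => ["A", "}"]
split_comparison_string('\a\}b}')  # => ["a}b", "}"]
split_comparison_string('A')       # => nil
```

The point is that an unescaped `}` is simply not in the set of characters the body may consume, so the closing brace is left for whatever rule encloses the comparison string.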
I'm trying to write a parser for the MARCSpec grammar, and I'm running into trouble with this bit:
Relevant parts of my code:
I can parse a `subTermSet` fine, but not a `subSpec`, e.g. `\A` but not `{\A}`. I think what's happening is that my parser for `comparisonString` sees the trailing `}`, not unreasonably, as part of its own value, so the `subSpec` parser runs out of characters. If I simplify my `subSpec` rule down to:
I still get this failure:
Is there any way to get around this, or am I running up against some inherent PEG parser limitation?
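
On the question of an inherent limitation: repetition in a PEG is greedy and never gives characters back, so a "match any visible character" loop will swallow a closing `}` and the enclosing rule then fails. The usual way around it is a negative lookahead before each repeated character; in Parslet that would be something along the lines of `(str('}').absent? >> any).repeat(1)`. The sketch below imitates the same behavior with plain Ruby regexes, using an atomic group `(?>...)` to stand in for PEG's no-backtracking repetition. The constants `VISIBLE`, `GREEDY`, and `GUARDED` are names invented for this illustration, and the patterns are a simplification, not the MARCSpec grammar.

```ruby
# Character class source for visible ASCII, interpolated into the patterns below.
VISIBLE = '[\x21-\x7e]'

# Note: \A and \z in the patterns are regex anchors; the \A inside the
# test strings is a literal backslash followed by "A".

# Atomic group: the repetition keeps everything it matched, like PEG repeat.
# It consumes the closing brace, so the final \} has nothing left to match.
GREEDY  = /\A\{\\(?>#{VISIBLE}+)\}\z/

# Same loop, but each step first checks that the next character is not "}".
GUARDED = /\A\{\\(?>(?:(?!\})#{VISIBLE})+)\}\z/

GREEDY.match?('{\A}')   # => false
GUARDED.match?('{\A}')  # => true
```

The trade-off of the lookahead approach is that `}` can then never appear in the value at all unless the grammar also defines an escape for it.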