Open gajus opened 2 years ago
Unquoted strings are tricky.
If you want `/foo/i` to be matched by Nearley as a regex-like string and not as an unquoted string, then your unquoted string grammar must contain something that will always fail on it. For example, require that `/` must always be escaped.
If that's fine, then your unquoted string will be a sequence of characters, where each character is either `\/` (or any escape sequence flavor you prefer) or anything but `/` (and other characters you want to escape).
This requirement can be relaxed a bit: if something starts with `/` then it is regex-like, otherwise it is an unquoted string. But the first character of an unquoted string still can't be an unescaped `/`, so the provision for escaping is still required and the grammar will be more complicated.
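
The two rules above can be sketched outside Nearley with plain JavaScript regular expressions. This is only an illustration of the classification logic, not a Nearley grammar: the names `classifyStrict` and `classifyRelaxed` are hypothetical, and the flag set `gmiyusd` and the "backslash plus any character" escape form are assumptions.

```javascript
// Assumption: an escape is a backslash followed by any single character.
// Regex-like: "/" body "/" flags, where the body has no unescaped "/".
const REGEX_LIKE = /^\/((?:\\[\s\S]|[^\\\/])*)\/([gmiyusd]*)$/;

// Strict rule: every "/" inside an unquoted string must be escaped.
const UNQUOTED_STRICT = /^(?:\\[\s\S]|[^\\\/])+$/;

// Relaxed rule: only the first character may not be an unescaped "/".
const UNQUOTED_RELAXED = /^(?:\\[\s\S]|[^\\\/])(?:\\[\s\S]|[^\\])*$/;

function classify(token, unquotedPattern) {
  const m = REGEX_LIKE.exec(token);
  if (m) return { type: "regex", body: m[1], flags: m[2] };
  if (unquotedPattern.test(token)) return { type: "unquoted", value: token };
  return { type: "invalid" };
}

// Hypothetical helpers, one per rule described above.
const classifyStrict = (token) => classify(token, UNQUOTED_STRICT);
const classifyRelaxed = (token) => classify(token, UNQUOTED_RELAXED);
```

Under the strict rule `foo/bar` is rejected, while the relaxed rule accepts it as an unquoted string because it does not start with `/`. In both cases `/foo/i` is classified as regex-like.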
After you distinguish between regex-like and unquoted strings, unquoted strings might still bite you somewhere else.
Nearley has no efficient means to discard alternatives when multiple interpretations are possible. Some discussion in #591.
For a use case where I had to deal with unquoted strings, and where "take first successful match" logic works for alternative parsings, I just dropped Nearley and went with my own solution.
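
For illustration, "take first successful match" logic can be sketched in a few lines of plain JavaScript. This is not the solution mentioned above (which is not shown in this thread), just a minimal sketch of ordered choice; `firstMatch` and the two alternatives are hypothetical names, and the regex-like pattern is an assumption.

```javascript
// Try each alternative in order and return the first one that succeeds.
// Each alternative's parse function returns null on failure.
function firstMatch(alternatives, input) {
  for (const { name, parse } of alternatives) {
    const value = parse(input);
    if (value !== null) return { name, value };
  }
  return null;
}

const alternatives = [
  {
    name: "regex",
    parse: (s) => {
      // Assumed shape: "/" body "/" flags, with "\" escaping any character.
      const m = /^\/((?:\\[\s\S]|[^\\\/])*)\/([gmiyusd]*)$/.exec(s);
      return m ? { body: m[1], flags: m[2] } : null;
    },
  },
  {
    // Fallback: any non-empty input counts as an unquoted string.
    name: "unquoted",
    parse: (s) => (s.length > 0 ? s : null),
  },
];
```

Because the regex-like alternative is listed first, `/foo/i` never reaches the unquoted fallback, so the unquoted rule does not need to reject it explicitly. An input such as `/foo` fails the first alternative and falls through to the second.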
This appears to work fine:

    regex ->
        regex_body regex_flags {% d => d.join('') %}

    regex_body ->
        "/" regex_body_char:* "/" {% d => '/' + d[1].join('') + '/' %}

    # A body character is any character except a backslash,
    # or a backslash followed by a non-backslash character.
    regex_body_char ->
        [^\\] {% id %}
      | "\\" [^\\] {% d => '\\' + d[1] %}

    # Flags are optional.
    regex_flags ->
        null
      | [gmiyusd]:+ {% d => d[0].join('') %}
I need to match regex-like strings, such as `/foo/i`, and distinguish them from regular unquoted strings.