I recently created the regex package, a template tag for writing regexes as raw strings. Among other features, it always implicitly enables flag x, based on the details in this proposal (which are fairly limited right now, so I also drew on flag xx from Perl and PCRE).
It always uses flag v (if available) or u implicitly, so it can't be used to test x in Unicode-unaware mode. And because it uses template strings, it doesn't need to worry about ()[]/ in comments (comments are stripped before the pattern is passed to the RegExp constructor). But with those caveats, you can use it to test x behavior even for edge cases.
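To illustrate why comment contents stop mattering once they are stripped before the RegExp constructor sees the pattern, here is a minimal sketch. The `toyX` tag is hypothetical and is not the regex package's implementation: it assumes no interpolation, always uses flag u, only strips whitespace and # comments outside character classes, and does none of the token validation or separator insertion described in this comment.

```javascript
// Hypothetical toy tag for illustration only (NOT the regex package's code).
// Assumptions: no interpolated values, flag u, no nested character classes.
function toyX(strings, ...values) {
  if (values.length > 0) {
    throw new Error('toyX does not support interpolation');
  }
  const raw = strings.raw[0];
  let out = '';
  let inClass = false;
  for (let i = 0; i < raw.length; i++) {
    const ch = raw[i];
    if (ch === '\\') {
      // Copy a backslash and the character it escapes untouched.
      out += ch + (raw[i + 1] ?? '');
      i++;
    } else if (inClass) {
      // This toy leaves character class contents as written.
      if (ch === ']') inClass = false;
      out += ch;
    } else if (ch === '[') {
      inClass = true;
      out += ch;
    } else if (ch === '#') {
      // Skip the comment, stopping just before the newline.
      while (i + 1 < raw.length && raw[i + 1] !== '\n') i++;
    } else if (!/\s/.test(ch)) {
      out += ch;
    }
  }
  return new RegExp(out, 'u');
}

// The comment text may contain ( ) [ ] / because it never reaches RegExp.
const re = toyX`
  (\d{4})  # year: ( ) [ ] / are harmless here
  -
  (\d{2})  # month
`;
console.log(re.source);
console.log(re.test('2024-05'));
```

Because this sketch simply deletes whitespace, it would turn `\0 1` into `\01` rather than `\0(?:)1`, which is one of the edge cases the real handling has to get right.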
Just for example:
regex`[a- -b]` is an error (a range from a to an unescaped, invalid -).
regex`[a& &b]` is equivalent to /[a&b]/.
regex`[a && b]` is equivalent to /[a&&b]/v.
regex`\0 1` is equivalent to /\0(?:)1/.
\c A and (? :) are errors, and ( ?:) is an error because you can't quantify (.
Quantifiers following whitespace and/or comments apply to the preceding token, so x + is equivalent to x+.
Whitespace and/or comments are allowed to separate a quantifier and the ? that makes it lazy.
Within character classes and [\q{…}], only space and tab are insignificant; # and other whitespace characters are not.
Outside of character classes, the insignificant whitespace characters are those matched natively by \s.
Excluding [\q{…}], whitespace is significant in enclosed tokens:
Outside of character classes, these are \u{…}, \p{…}, \P{…}, (?<…>), \k<…>, and {…}.
Within character classes, these are \u{…}, \p{…}, and \P{…}.
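Some of the native equivalents listed above can be checked with plain RegExp, without the regex package. The sketch below is only a sanity check of the right-hand sides; the `compiles` helper is a hypothetical name introduced here, and the flag v check is guarded because that flag is not available in every engine.

```javascript
// Hypothetical helper: reports whether a pattern compiles with the given flags.
function compiles(source, flags) {
  try {
    new RegExp(source, flags);
    return true;
  } catch {
    return false;
  }
}

// [a& &b] -> /[a&b]/: a class of three literal characters.
const literalAmp = /[a&b]/u;
console.log(['a', '&', 'b'].every((ch) => literalAmp.test(ch)));

// \0 1 -> /\0(?:)1/: the empty group keeps \0 and 1 as separate tokens,
// so the pattern matches NUL followed by the digit 1.
const nulThenOne = /\0(?:)1/u;
console.log(nulThenOne.test('\u{0}1'));

// Simply deleting the space would give \01, which flag u rejects.
console.log(compiles('\\01', 'u'));

// [a && b] -> /[a&&b]/v: an intersection of a and b, which is empty.
const hasFlagV = compiles('[a&&b]', 'v');
const intersection = hasFlagV ? new RegExp('[a&&b]', 'v') : null;
if (intersection) {
  console.log(['a', '&', 'b'].some((ch) => intersection.test(ch)));
}
```

The `\01` case is the reason an empty group is used as a separator: under flag u, `\0` may not be followed directly by a decimal digit.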
If additional details are clarified in this proposal and they don't match regex's handling, I will update it to stay in line.