Stage: 1
Champion: Ron Buckton (@rbuckton)
For detailed status of this proposal see TODO, below.
The RegExp Extended mode is a feature supported by many regular expression engines. It makes regular expressions easier to read and understand by allowing insignificant white space and comments.
See https://rbuckton.github.io/regexp-features/features/comments.html and https://rbuckton.github.io/regexp-features/features/line-comments.html for additional information.
Prior Art: Perl, PCRE, Boost.Regex, .NET, Oniguruma, Hyperscan, ICU, Glib/GRegex (feature comparison)
The extended mode (x) flag treats unescaped whitespace characters as insignificant, allowing for multi-line regular expressions. It also enables Line Comments.
NOTE: The x-mode flag can be used inside of a Modifier.
NOTE: While the x-mode flag can be used in a RegularExpressionLiteral, it does not permit the use of LineTerminator in a RegularExpressionLiteral. For multi-line regular expressions you would need to use the RegExp constructor.
NOTE: Perl's original x-mode treated whitespace as insignificant anywhere within a pattern except for within character classes. Perl v5.26 introduced the xx flag, which also ignores non-escaped SPACE and TAB characters within character classes. Should we choose to adopt the x-mode flag, we could opt to treat it as Perl's xx mode at the outset.
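Because the x flag is only proposed here, the whitespace rule above can be approximated today by preprocessing the pattern source before handing it to the RegExp constructor. The following is a minimal sketch, not the proposal's algorithm: stripInsignificantWhitespace is a hypothetical helper name, and it assumes Perl's original x semantics (whitespace inside character classes is kept) rather than xx.

```javascript
// Hypothetical helper: not part of the proposal or of any engine.
// Removes unescaped whitespace from a pattern source so the result can be
// passed to today's RegExp constructor. Whitespace inside a character
// class is preserved (Perl's original x semantics, not xx).
function stripInsignificantWhitespace(source) {
  let out = "";
  let inClass = false;
  for (let i = 0; i < source.length; i++) {
    const ch = source[i];
    if (ch === "\\" && i + 1 < source.length) {
      // Keep escape sequences intact, including escaped whitespace.
      out += ch + source[i + 1];
      i++;
    } else if (inClass) {
      if (ch === "]") inClass = false;
      out += ch;
    } else if (ch === "[") {
      inClass = true;
      out += ch;
    } else if (!/\s/.test(ch)) {
      out += ch;
    }
  }
  return out;
}

// A multi-line source, written with String.raw so backslashes survive.
const dateSource = String.raw`
  (\d{4}) - (\d{2}) - (\d{2})
`;
const dateRe = new RegExp(stripInsignificantWhitespace(dateSource));
console.log(dateRe.source);             // (\d{4})-(\d{2})-(\d{2})
console.log(dateRe.test("2024-01-31")); // true
```

A native x flag would make this preprocessing step unnecessary; the sketch only illustrates which whitespace is treated as insignificant.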
Prior Art: Perl, PCRE, Boost.Regex, .NET, Oniguruma, Hyperscan, ICU, Glib/GRegex (feature comparison)
An inline comment is a sequence of characters that is ignored by pattern matching and can be used to document a pattern.
(?#comment): The entire expression is removed from the pattern.
NOTE: Inline comments cannot contain ( or ) characters (instead, they must be escaped).
NOTE: In a RegularExpressionLiteral, inline comments cannot contain / (unless it is escaped).
NOTE: In a RegularExpressionLiteral, inline comments may need to escape [ and ] as well, unless we are able to change the definition for RegularExpressionBody in the specification so as to permit an unbalanced pair of [ and ] within a comment. See Issue #1.
NOTE: This has no conflicts with existing syntax, as ECMAScript currently produces an error for this syntax in both u and non-u modes.
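The removal of (?#comment) from a pattern can be sketched as a small preprocessing pass over the pattern source. This is an illustration under stated assumptions, not the proposal's specification text: stripInlineComments is a hypothetical helper, a comment is taken to end at the first unescaped ")", and a "(?#" sequence inside a character class is assumed to be ordinary class content.

```javascript
// Hypothetical helper, for illustration only.
// Removes (?#...) comments from a pattern source. A comment ends at the
// first unescaped ")"; escaped characters inside the comment are skipped.
// Assumption: "(?#" inside a character class is left untouched.
function stripInlineComments(source) {
  let out = "";
  let inClass = false;
  let i = 0;
  while (i < source.length) {
    const ch = source[i];
    if (ch === "\\" && i + 1 < source.length) {
      // Copy escape sequences through unchanged.
      out += ch + source[i + 1];
      i += 2;
    } else if (inClass) {
      if (ch === "]") inClass = false;
      out += ch;
      i++;
    } else if (ch === "[") {
      inClass = true;
      out += ch;
      i++;
    } else if (source.startsWith("(?#", i)) {
      i += 3;
      // Skip to the closing ")" of the comment, stepping over escapes.
      while (i < source.length && source[i] !== ")") {
        i += source[i] === "\\" ? 2 : 1;
      }
      if (i >= source.length) {
        throw new SyntaxError("Unterminated inline comment");
      }
      i++; // consume the closing ")"
    } else {
      out += ch;
      i++;
    }
  }
  return out;
}

const commentedSource = String.raw`foo(?#separator \(optional\))-?bar`;
const commentedRe = new RegExp(stripInlineComments(commentedSource));
console.log(commentedRe.source);         // foo-?bar
console.log(commentedRe.test("foobar")); // true
```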
Prior Art: Perl, PCRE, .NET, ICU, Glib/GRegex (feature comparison)
A Line Comment is a sequence of characters starting with # and ending with \n (or the end of the pattern) that is ignored by pattern matching and can be used to document a pattern.
# comment: A line comment in a multi-line RegExp.
NOTE: Requires the x-mode flag.
NOTE: Inside of x-mode, the # character must be escaped (using \#) outside of a character class.
NOTE: Line comments are not supported in x-mode in a RegularExpressionLiteral.
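The line comment rules above (a comment runs from an unescaped # outside a character class to \n or the end of the pattern) can be sketched as a preprocessing pass. stripLineComments is a hypothetical helper used only for illustration; it assumes the resulting pattern is compiled without the u flag, where \# is accepted as an identity escape.

```javascript
// Hypothetical helper, for illustration only.
// Removes "# ..." line comments from a pattern source. A comment runs up
// to (but not including) the next "\n", or to the end of the pattern.
// An escaped "\#" and a "#" inside a character class are kept.
function stripLineComments(source) {
  let out = "";
  let inClass = false;
  let i = 0;
  while (i < source.length) {
    const ch = source[i];
    if (ch === "\\" && i + 1 < source.length) {
      // Copy escape sequences through unchanged, including "\#".
      out += ch + source[i + 1];
      i += 2;
    } else if (inClass) {
      if (ch === "]") inClass = false;
      out += ch;
      i++;
    } else if (ch === "[") {
      inClass = true;
      out += ch;
      i++;
    } else if (ch === "#") {
      // Skip the comment body; the terminating "\n" is left in place.
      while (i < source.length && source[i] !== "\n") i++;
    } else {
      out += ch;
      i++;
    }
  }
  return out;
}

const alnumSource = String.raw`
# match ASCII alpha-numerics
[a-zA-Z0-9]
`;
// Crude whitespace removal, adequate here only because this pattern
// contains no significant whitespace.
const alnumRe = new RegExp(stripLineComments(alnumSource).replace(/\s+/g, ""));
console.log(alnumRe.source);    // [a-zA-Z0-9]
console.log(alnumRe.test("x")); // true
```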
const re = /(foo) (bar) (baz)/x;
re.test("foobarbaz"); // true

const re = /foo(?#comment)bar/;
re.test("foobar"); // true

const re = new RegExp(String.raw`
# match ASCII alpha-numerics
[a-zA-Z0-9]
`, "x");
RegExp.prototype.extended (Boolean): Indicates the x-mode flag is set.
The following is a high-level list of tasks to progress through each stage of the TC39 proposal process: