Note that the syntax in 11.8.5 (Regular Expression Literals) is used only to recognise the end of a regular expression literal /.../ in ES code, not to analyse it. At first glance it should not need changing, because \p and \P are already handled properly, and { and } play no special role in that grammar.
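To illustrate why the literal grammar can stay as it is, here is a rough sketch of a scanner that only looks for the closing slash. The function name findRegExpBodyEnd is made up for this example, and the logic is simplified (it ignores the rule about the first character and does not validate flags); it is not the spec algorithm itself.

```javascript
// Line terminators may not appear anywhere inside a regular expression literal.
const isLineTerminator = (ch) =>
  ch === "\n" || ch === "\r" || ch === "\u2028" || ch === "\u2029";

// Hypothetical helper: source[start] must be the opening "/".
// Returns the index of the closing "/", or -1 if the literal is unterminated.
function findRegExpBodyEnd(source, start) {
  let inClass = false; // true while inside [...]
  for (let i = start + 1; i < source.length; i++) {
    const ch = source[i];
    if (ch === "\\") {
      // A backslash sequence: skip the next character, whatever it is
      // (p, P, /, ...), as long as it is not a line terminator.
      i++;
      if (i >= source.length || isLineTerminator(source[i])) return -1;
      continue;
    }
    if (isLineTerminator(ch)) return -1;
    if (ch === "[") inClass = true;
    else if (ch === "]") inClass = false;
    else if (ch === "/" && !inClass) return i; // "/" inside [...] does not end the literal
  }
  return -1;
}

// The braces and the \p escape need no special treatment here.
const end = findRegExpBodyEnd("/\\p{L}+\\/[/}]/u;", 0);
```

The scanner never looks inside \p{...}, so any new meaning for it only has to be defined in the pattern grammar.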
The grammar that should be augmented is the one in 21.2.1 (Patterns), so that \pX and \p{...} are recognised as an Atom. A first sketch:
    Atom[U] ::
        \ AtomEscape[?U]
        etc.

    AtomEscape[U] ::
        DecimalEscape               <--- this is for \123
        CharacterEscape[?U]         <--- this is for \n, \uXXXX, etc. (a designated character)
        CharacterClassEscape[?U]    <--- this is for \d, \s, etc. (a class of characters)

    CharacterClassEscape[U] ::
        d
        D
        s
        S
        w
        W
        [+U] p someLetter
        [+U] p { someSequence }
        [+U] P someLetter
        [+U] P { someSequence }
where someLetter and someSequence remain to be determined.
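The sketched CharacterClassEscape[U] production could be recognised roughly like this. Everything here is hypothetical: the function name, the shape of the returned object, and in particular the choices that someLetter is a single ASCII letter and someSequence is a non-empty run of characters other than "}", since those are still to be determined.

```javascript
// Hypothetical recogniser for the sketched CharacterClassEscape[U].
// pattern[index] is the character right after the backslash.
// unicodeMode plays the role of the [U] grammar parameter.
// Returns a description of the escape, or null if nothing matches.
function parseCharacterClassEscape(pattern, index, unicodeMode) {
  const ch = pattern[index];
  if (ch === undefined) return null;

  // d D s S w W are available with or without the U parameter.
  if ("dDsSwW".includes(ch)) {
    return { kind: "class", name: ch, negated: ch === ch.toUpperCase(), end: index + 1 };
  }

  // The [+U] alternatives: only considered when the U parameter is set.
  if (unicodeMode && (ch === "p" || ch === "P")) {
    const negated = ch === "P";
    const next = pattern[index + 1];
    if (next === "{") {
      // Assumption: someSequence is one or more characters other than "}".
      const close = pattern.indexOf("}", index + 2);
      if (close > index + 2) {
        return { kind: "property", name: pattern.slice(index + 2, close), negated, end: close + 1 };
      }
      return null; // missing "}" or empty sequence
    }
    // Assumption: someLetter is a single ASCII letter.
    if (next !== undefined && /^[A-Za-z]$/.test(next)) {
      return { kind: "property", name: next, negated, end: index + 2 };
    }
  }
  return null;
}

const braced = parseCharacterClassEscape("\\p{Lu}x", 1, true);
const withoutU = parseCharacterClassEscape("\\p{Lu}x", 1, false);
```

With unicodeMode set, braced describes a non-negated property named Lu; without it, withoutU is null, so \p would fall through to whatever the other AtomEscape alternatives do with it.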
FWIW, ES4 did it very similarly: http://wiki.ecmascript.org/lib/exe/fetch.php?id=spec%3Aspec&cache=cache&media=spec:library-d2.html#RegExp%20grammar
    CharacterClassEscape ::
        d => charset_digit
        D => CharsetComplement( charset_digit )
        s => charset_space
        S => CharsetComplement( charset_space )
        w => charset_word
        W => CharsetComplement( charset_word )
        p { UnicodeClass } => unicodeClass( UnicodeClass )
        P { UnicodeClass } => CharsetComplement( unicodeClass( UnicodeClass ) )
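The ES4 semantics above boil down to "the upper-case form is the complement of the lower-case form". A minimal sketch of that idea, modelling a character set as a predicate on a single character. The names mirror the ES4 draft, but the implementation is an assumption made for illustration; in particular, unicodeClass here simply delegates to the \p{...} escape as it later shipped in ES2018, so it needs an engine that supports it.

```javascript
// A charset is modelled as a function from a one-character string to boolean.
const charsetDigit = (ch) => /^[0-9]$/.test(ch);

// CharsetComplement( cs ): everything that cs does not contain.
const charsetComplement = (charset) => (ch) => !charset(ch);

// Stand-in for ES4's unicodeClass( UnicodeClass ). Assumption: we lean on
// the ES2018 \p{...} escape instead of carrying our own Unicode tables.
const unicodeClass = (name) => {
  const re = new RegExp(`^\\p{${name}}$`, "u");
  return (ch) => re.test(ch);
};

// \D and \P{Lu} expressed through the same complement operation.
const notDigit = charsetComplement(charsetDigit);
const upper = unicodeClass("Lu");
const notUpper = charsetComplement(upper);
```

The point of the design is that P needs no separate definition: it is whatever p is, run through the same complement that already serves D, S and W.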
First draft: https://github.com/tc39/ecma262/compare/master...mathiasbynens:unicodePropertyEscape?expand=1
I’m mostly looking for feedback on whether I’m writing the spec correctly, not in terms of functionality / features.
Update: Made it into a PR so it’s easier to leave comments on specific lines: https://github.com/mathiasbynens/ecma262/pull/1/files