Open RyanCavanaugh opened 7 months ago
I expect all of the errors not related to --target are a result of regular expressions that are allowed per Annex B.
IMO, all of the "Octal escape sequences are not allowed" and "A decimal escape must refer to an existent capturing group" are probably indications of actual errors in user code. They're allowed in Annex B, but the user likely intended to use them as a backreference to a capture group and that's not how Annex B would treat them.
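To make the decimal escape point concrete, here is a minimal sketch of how an engine with Annex B support (Node.js/V8 is assumed here) reads an escape that refers to a nonexistent group. The patterns go through the RegExp constructor only so the snippet compiles regardless of the literal checker; the `compiles` helper is a hypothetical name introduced for illustration.

```typescript
// Sketch: behavior on an engine with Annex B support (assumed: Node.js / V8).
// Patterns are passed as strings to the RegExp constructor, so the
// compile-time checker for regex literals never sees them.

// One capturing group, but the pattern refers to group 2.
const danglingRef = new RegExp("(a)\\2");

// Annex B reads \2 as a legacy octal escape for U+0002, not as a backreference.
const matchesControlChar = danglingRef.test("a\u0002"); // true
const matchesRepeat = danglingRef.test("aa"); // false

// Hypothetical helper: reports whether a pattern compiles with the given flags.
function compiles(pattern: string, flags: string): boolean {
  try {
    new RegExp(pattern, flags);
    return true;
  } catch (e) {
    return false;
  }
}

// With the u flag the Annex B leniency is gone and the same pattern
// is rejected with a SyntaxError.
const compilesInUnicodeMode = compiles("(a)\\2", "u"); // false
```

A user who wrote `\2` almost certainly did not mean "match the control character U+0002", which is why reporting this case still seems useful even though Annex B accepts it.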
All of the "A character class range must not be bounded by another character class" errors are probably fine and shouldn't be reported. Annex B allows them and most users wrote something like [\w-.] or the like thinking it meant "word characters, -, and .", which is how Annex B treats them.
Ah, I did once mention this on my PR and thought it was fine since Ryan reacted to my comment. https://github.com/microsoft/TypeScript/pull/55600#issuecomment-1735102411 I am fine with weakening the grammar; however, keep in mind that we can’t guarantee everything runs on engines with Annex B support, though I understand that this is mostly the case. IMHO another compiler option is the only realistic way to solve this, unfortunately.
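For the character class case, here is a minimal sketch of the Annex B reading and of what happens once that leniency is unavailable. It assumes an engine with Annex B support (Node.js/V8) for the first pattern and uses the u flag as a stand-in for strict parsing; the variable names are illustrative only.

```typescript
// Sketch: assumes an engine with Annex B support (e.g. Node.js / V8).
// The RegExp constructor is used so the literal checker is not involved.

// Annex B reading of [\w-.]: word characters, "-", and ".".
const lenient = new RegExp("[\\w-.]");
const matchesWord = lenient.test("a"); // true
const matchesHyphen = lenient.test("-"); // true
const matchesDot = lenient.test("."); // true
const matchesComma = lenient.test(","); // false: no range is formed

// Under the u flag the same class is a SyntaxError.
let rejectedInUnicodeMode = false;
try {
  new RegExp("[\\w-.]", "u");
} catch (e) {
  rejectedInUnicodeMode = e instanceof SyntaxError;
}

// Escaping the hyphen states the intent explicitly and is accepted
// both with and without the u flag.
const explicit = new RegExp("[\\w\\-.]", "u");
const explicitMatchesHyphen = explicit.test("-"); // true
```

Writing the class as `[\w\-.]` keeps the meaning users intended while remaining valid in strict mode, so it is a reasonable rewrite when the error is reported.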
IMO, all of the "Octal escape sequences are not allowed" and "A decimal escape must refer to an existent capturing group" are probably indications of actual errors in user code.
Yes, I actually thought that there is a consensus on not allowing any octal escapes anywhere per #53198 😅
Great work on adding validation for regexp!
We came across another regression on 5.5 for character class escapes with script extensions that I did not see listed above:
const regexpNonLatin = /\P{Script_Extensions=Latin}+/gu;
Unknown Unicode property value.
The issue seems specific to Script_Extensions and scx - Script is working fine. Same behavior is observed for \p and \P.
"٢".match(/\p{Script=Thaana}/u); // OK on 5.5
"٢".match(/\p{Script_Extensions=Thaana}/u); // KO on 5.5
// @ts-ignore can be used to work around the error, as hinted on https://github.com/microsoft/TypeScript/pull/58295
Those regexps are part of the samples on https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Unicode_character_class_escape ; we use something similar in our codebase and faced this when pretesting our typescript upgrade.
Would it be possible to support script extension values in 5.5?
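As a rough illustration of why the two properties are not interchangeable, the sketch below compares them at runtime. It assumes an engine with Unicode property escape support and full Unicode data (for example a recent Node.js). Building the patterns with the RegExp constructor is an alternative workaround assumed here, not one mentioned in the thread: it sidesteps the check because only regex literals are validated.

```typescript
// Sketch: assumes a runtime with Unicode property escapes (ES2018+).
// Patterns are passed as strings, so the literal checker does not run on them.

// "٢" is U+0662 ARABIC-INDIC DIGIT TWO. Its Script is Arabic, but its
// Script_Extensions set also includes Thaana.
const digit = "\u0662";

const byScript = new RegExp("\\p{Script=Thaana}", "u");
const byScriptExtensions = new RegExp("\\p{Script_Extensions=Thaana}", "u");
const byShortAlias = new RegExp("\\p{scx=Thaana}", "u");

const scriptMatches = byScript.test(digit); // false
const extensionsMatch = byScriptExtensions.test(digit); // true
const aliasMatches = byShortAlias.test(digit); // true

// The negated form \P{...} flips the result for the same character.
const negatedMatches = new RegExp("\\P{Script_Extensions=Thaana}", "u").test(digit); // false
```

Replacing Script_Extensions with Script to silence the error would therefore change what the pattern matches.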
@nostalic OMG, that’s totally my fault, I am very bad. I made it empty because the Script_Extensions section in PropertyValueAliases.txt shows nothing, without thinking much.
However, I don’t think the Team will have time to review PRs related to regular expressions in the immediate future; they haven’t even reviewed my short follow-up PRs yet 😅
@graphemecluster This is a great improvement, and the regex validation helps to catch some issues we had, so thanks for implementing it!
The issue can be worked around and as such this is not a blocker for us, though it would be great to have it fixed in 5.5 🙂
@nostalic OMG, that’s totally my fault, I am very bad. I made it empty because the Script_Extensions section in PropertyValueAliases.txt shows nothing, without thinking much. However, I don’t think the Team will have time to review PRs related to regular expressions in the immediate future; they haven’t even reviewed my short follow-up PRs yet 😅
Please do send things if you have them; I do think we want to get things looked at before 5.5 is branched off.
Thoughts on this: since we already do regex group checking (as per the release notes), shouldn't the resulting match groups be typed?
(tried on playground with 5.5-beta)
No, the type system does not special case regexes like this. (yet?)
Enabling further implementation of regex type checking is the most vital reason why I implemented regex syntax checking, and it’s gonna be the most exciting part 😆
Note: I eventually gave up on capturing "Not available unless target is ESXXXX" errors since they're not really interesting to look at
Via #58275
This character cannot be escaped in a regular expression.
Named capturing groups are only available when targeting 'ES2018' or later
Named capturing groups are only available when targeting 'ES2018' or later.
Named capturing groups are only available when targeting 'ES2018' or later
This regular expression flag is only available when targeting 'es2018' or later
This character cannot be escaped in a regular expression
Named capturing groups are only available when targeting 'ES2018' or later
Octal escape sequences are not allowed. Use the syntax '\x04'
Named capturing groups are only available when targeting 'ES2018' or later
Named capturing groups are only available when targeting 'ES2018' or later
A character class range must not be bounded by another character class
Octal escape sequences are not allowed. Use the syntax '\x02'.
This character cannot be escaped in a regular expression.
Octal escape sequences are not allowed. Use the syntax '\x02'
A character class range must not be bounded by another character class
This regular expression flag is only available when targeting 'es2022' or later
This regular expression flag is only available when targeting 'es2018' or later
This regular expression flag is only available when targeting 'es6' or later
Octal escape sequences are not allowed. Use the syntax '\x00'
This regular expression flag is only available when targeting 'es2018' or later
A character class range must not be bounded by another character class
This character cannot be escaped in a regular expression
This regular expression flag is only available when targeting 'es6' or later
There is nothing available for repetition
'}' expected
A character class range must not be bounded by another character class
A character class range must not be bounded by another character class.
A character class range must not be bounded by another character class
This character cannot be escaped in a regular expression.
This regular expression flag is only available when targeting 'es2022' or later.
Octal escape sequences are not allowed. Use the syntax '\x09'.
A character class range must not be bounded by another character class.
A character class range must not be bounded by another character class.
Named capturing groups are only available when targeting 'ES2018' or later
A character class range must not be bounded by another character class.
This character cannot be escaped in a regular expression
A character class range must not be bounded by another character class
A character class range must not be bounded by another character class
dozens of these in this file, see https://github.com/microsoft/TypeScript/issues/58275#issuecomment-2068174097
A decimal escape must refer to an existent capturing group. There are only 1 capturing groups in this regular expression
A character class range must not be bounded by another character class
Unicode property value expressions are only available when the Unicode (u) flag or the Unicode Sets (v) flag is set
A character class range must not be bounded by another character class