umijs / umi

A framework in react community ✨
https://umijs.org
MIT License

[Feature Request] Support RegExp literals with the `u` flag #8704

Closed: tonywu6 closed this issue 2 years ago

tonywu6 commented 2 years ago

Background

ES2018 introduced Unicode property escapes, which make it possible to write regular expressions that match characters by Unicode property, such as script or general category. For example, the RegExp /[\p{Script=Han}\p{Script=Latin}]/u matches all current and future Latin characters (ASCII letters, Latin-1 letters, etc.) and Han characters (hanzi, kanji, hanja):

/^[\p{Script=Han}\p{Script=Latin}]+$/gu.test('Appleseed')
// true
/^[\p{Script=Han}\p{Script=Latin}]+$/gu.test('Aufklärung')
// true
/^[\p{Script=Han}\p{Script=Latin}]+$/gu.test('中国智造惠及全球')
// true
/^[\p{Script=Han}\p{Script=Latin}]+$/gu.test('国会議事堂前駅')
// true

According to Can I Use, this feature has 94.77% global browser support.

To use this feature, the RegExp must be compiled with the u flag. umi currently does not support this flag in RegExp literals: adding it results in a compilation error (see example):

Module build failed (from ./node_modules/@umijs/bundler-webpack/compiled/babel-loader/index.js):
TypeError: ....tsx: symbol.charCodeAt is not a function
    at symbolToCodePoint (node_modules/@umijs/bundler-utils/compiled/babel/index.js:121445:22)

However, enabling the u flag through the RegExp constructor (new RegExp('...', 'u')) behaves as expected.
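Until literals are supported, the constructor form can serve as a workaround. Below is a minimal sketch that reuses the pattern from the Background section. The names HAN_OR_LATIN and isHanOrLatin are illustrative only and not part of umi. Note that backslashes must be doubled inside the string.

```javascript
// Workaround sketch: build the pattern at runtime with the RegExp constructor,
// so the build never has to transform a /…/u literal.
// Backslashes are doubled because the pattern is written as a string.
const HAN_OR_LATIN = new RegExp('^[\\p{Script=Han}\\p{Script=Latin}]+$', 'u');

// Hypothetical helper name, used only for this illustration.
function isHanOrLatin(text) {
  return HAN_OR_LATIN.test(text);
}

console.log(isHanOrLatin('Aufklärung'));
console.log(isHanOrLatin('国会議事堂前駅'));
```

The g flag from the examples above is omitted here on purpose: a shared RegExp object with the g flag keeps state in lastIndex between test() calls, which is not wanted for a reusable validator.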

Proposal

Support this kind of RegExp by default by enabling @babel/plugin-proposal-unicode-property-regex (which I think is enabled automatically when the target is ES2018), so that users do not have to add it manually through extraBabelPlugins.

YdreamW commented 2 years ago

This problem is caused by our precompiled Babel. The plugin @babel/plugin-proposal-unicode-property-regex depends on the regexpu package, whose rewrite-pattern.js dynamically requires data modules from the regenerate-unicode-properties package. Those modules in turn depend on the regenerate package, which has already been precompiled into our Babel. As a result, the dynamically required regenerate and the precompiled regenerate are two different copies, so objects created by one are not recognized by the other.
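To illustrate why two copies of the same module can produce an error like "charCodeAt is not a function", here is a heavily simplified model. It is not the real regenerate source: CodePointSet and loadRegenerateCopy are hypothetical names, and the sketch only assumes that the library tells its own objects apart from plain values with an instanceof check.

```javascript
// Hypothetical, simplified stand-in for a set-of-code-points library.
// Each call simulates loading a separate copy of the same module.
function loadRegenerateCopy() {
  class CodePointSet {
    constructor() {
      this.codePoints = [];
    }
    add(value) {
      if (value instanceof CodePointSet) {
        // Object created by this same copy: merge its code points.
        this.codePoints.push(...value.codePoints);
        return this;
      }
      if (typeof value === 'number') {
        this.codePoints.push(value);
        return this;
      }
      // Anything else is assumed to be a one-character string.
      this.codePoints.push(value.charCodeAt(0));
      return this;
    }
  }
  return CodePointSet;
}

const PrecompiledSet = loadRegenerateCopy(); // copy bundled into the precompiled Babel
const RequiredSet = loadRegenerateCopy(); // copy resolved by a dynamic require

const hanData = new RequiredSet().add(0x4e2d);

// Same copy: the instanceof check passes and the sets merge.
const merged = new RequiredSet().add(hanData);

// Different copy: instanceof fails, so the object is treated as a string.
let crossCopyError = null;
try {
  new PrecompiledSet().add(hanData);
} catch (error) {
  crossCopyError = error;
}
console.log(crossCopyError instanceof TypeError);
```

In this model the failure goes away as soon as both sides share one copy of the class.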

One solution is to exclude the regenerate package when precompiling Babel. @PeachScript