Replace hard-coded parser unicode identifier detection with RegExp

knockout / tko

🥊 Technical Knockout – The Monorepo for Knockout.js (4.0+)

http://www.tko.io

Replace hard-coded parser unicode identifier detection with RegExp #144

Open brianmhunt opened 3 years ago

brianmhunt commented 3 years ago

Chrome 50 may become our minimum supported browser, which would let us use the RegExp unicode (u) flag:

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/unicode

We could then get rid of the following 10 KB file:

https://github.com/knockout/tko/blob/4a0c9c0d03bb2ed53aa058672b5429d88bf92264/packages/utils.parser/src/identifierExpressions.js

brianmhunt commented 3 years ago

Noting: https://mathiasbynens.be/notes/javascript-identifiers-es6

In ES2015, identifiers must start with $, _, or any symbol with the Unicode derived core property ID_Start.

The rest of the identifier can contain $, _, U+200C zero width non-joiner, U+200D zero width joiner, or any symbol with the Unicode derived core property ID_Continue.

brianmhunt commented 3 years ago

Noting: https://github.com/tc39/proposal-regexp-unicode-property-escapes#other-examples

const regexIdentifierStart = /[$_\p{ID_Start}]/u;
const regexIdentifierPart = /[$_\u200C\u200D\p{ID_Continue}]/u;
const regexIdentifierName = /^(?:[$_\p{ID_Start}])(?:[$_\u200C\u200D\p{ID_Continue}])*$/u;
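
As a rough illustration of how the full-name pattern behaves, here is a minimal sketch. It is not tko's parser code: isIdentifierName is a hypothetical helper introduced only for this example, and the \p{…} syntax needs an engine that supports Unicode property escapes, which arrived later than the u flag itself.

```javascript
// Pattern copied from the proposal example quoted above.
const regexIdentifierName =
  /^(?:[$_\p{ID_Start}])(?:[$_\u200C\u200D\p{ID_Continue}])*$/u;

// Hypothetical helper (not part of tko): true when `text` is a
// syntactically valid ES2015 identifier name.
function isIdentifierName(text) {
  return regexIdentifierName.test(text);
}

console.log(isIdentifierName('viewModel')); // ASCII letters
console.log(isIdentifierName('$element'));  // leading $
console.log(isIdentifierName('π'));         // non-ASCII ID_Start character
console.log(isIdentifierName('1abc'));      // digits cannot start an identifier
```

Note that the pattern only checks the character classes; it does not exclude reserved words such as if or class, so a parser would still need a separate check for those.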

tscpp commented 3 years ago

I have not worked with Unicode ranges before, but shouldn't it be possible to generate the regex at runtime (maybe as a fallback)? The website used to generate the regex clearly does it at runtime. The regex generation may be slow, but I think a slow fallback is better than nothing. Am I missing something?
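
One way to read the fallback idea is runtime feature detection: try to build the property-escape patterns, and use something else if the engine rejects them. The sketch below assumes an ASCII-only fallback purely as a stand-in (a real fallback would presumably be the existing generated table or a runtime-generated equivalent). buildIdentifierPatterns is a hypothetical name, not an existing tko function.

```javascript
// Hypothetical helper (not part of tko): returns identifier patterns,
// preferring Unicode property escapes when the engine supports them.
function buildIdentifierPatterns() {
  try {
    // Using the RegExp constructor defers any syntax error to runtime,
    // so an older engine can catch it instead of failing to parse the file.
    return {
      start: new RegExp('[$_\\p{ID_Start}]', 'u'),
      part: new RegExp('[$_\\u200C\\u200D\\p{ID_Continue}]', 'u'),
      unicode: true
    };
  } catch (e) {
    // ASSUMPTION: ASCII-only stand-in for whatever the real fallback would be.
    return {
      start: /[$_A-Za-z]/,
      part: /[$_A-Za-z0-9]/,
      unicode: false
    };
  }
}

const patterns = buildIdentifierPatterns();
console.log(patterns.unicode ? 'using \\p{…} patterns' : 'using ASCII fallback');
```

Either branch yields objects with the same shape, so the parser code that consumes start and part would not need to know which one was chosen.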