Open mathiasbynens opened 11 years ago
Sounds great - better performance, low impact, and compatible between node and the browser.
@jakl What kind of patch would you prefer?
I could create a new `tools` directory and add a quick `generate-regexes.js` file there that generates the regular expressions and writes their source to separate files (in a new `data` directory), for example.
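A minimal sketch of what such a generator script could look like, in plain Node.js and without Regenerate. The range table, the file names, and the `.txt` extension are all hypothetical placeholders, not this project's real data, and the escaping only covers BMP code points.

```javascript
'use strict';
const fs = require('fs');
const path = require('path');

// Hypothetical input: inclusive BMP code point ranges per regex name.
// These are placeholders, not the ranges used by this project.
const ranges = {
  'latin-letters': [[0x41, 0x5A], [0x61, 0x7A]],
  'digits': [[0x30, 0x39]]
};

// Format a BMP code point as a \uXXXX escape sequence.
function escapeCodePoint(codePoint) {
  return '\\u' + ('0000' + codePoint.toString(16).toUpperCase()).slice(-4);
}

// Turn a list of [start, end] pairs into character class source text.
function rangesToClass(list) {
  const body = list.map(([start, end]) =>
    start === end
      ? escapeCodePoint(start)
      : escapeCodePoint(start) + '-' + escapeCodePoint(end)
  ).join('');
  return '[' + body + ']';
}

// Write one file per regex into outputDir, creating the directory if needed.
function writeRegexSources(outputDir) {
  fs.mkdirSync(outputDir, { recursive: true });
  for (const [name, list] of Object.entries(ranges)) {
    fs.writeFileSync(path.join(outputDir, name + '.txt'), rangesToClass(list));
  }
}
```

Calling `writeRegexSources(path.join(__dirname, '..', 'data'))` from `tools/generate-regexes.js` would then produce files such as `data/digits.txt`, assuming that directory layout.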
The next step would be to tweak the build script so it automatically inserts the contents of those files in the right places in the source code. I generally use `grunt-template` for that, but this project is using a `Rakefile`, so I can imagine you’d rather not introduce another “build script” layer.
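To illustrate the insertion step on its own, here is a rough sketch in plain Node.js. The `<%= name %>` placeholder syntax and the one-file-per-name `.txt` convention are assumptions made for this example, not something the project or `grunt-template` prescribes, and the same idea could just as well live in the existing Rakefile.

```javascript
'use strict';
const fs = require('fs');
const path = require('path');

// Replace each hypothetical <%= name %> placeholder in `template` with the
// contents of `<dataDir>/<name>.txt`.
function inlineRegexSources(template, dataDir) {
  return template.replace(/<%=\s*([\w-]+)\s*%>/g, (match, name) =>
    // A function replacer inserts its return value literally, so `$`
    // characters in the regex source are not treated as replacement patterns.
    fs.readFileSync(path.join(dataDir, name + '.txt'), 'utf8').trim()
  );
}
```

A source line such as `var digits = /<%= digits %>/;` would come out of the build with the generated character class in place of the placeholder.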
+1
Also, I would be interested to hear where these ranges come from. Is this just a listing of all code points in a given Unicode category/script/block/…? Because that would make things even easier.
E.g. this:
With Regenerate, this could become:
But it would be even better to not do it at runtime, but as part of a build process:
This way, the source code (before building) is still very readable/maintainable, but the built code is optimized for run-time performance.
Note that using Regenerate would also solve this problem with astral symbols:
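As a generic illustration of this kind of astral-symbol problem (not the example from this thread, and not using Regenerate itself): without the `u` flag, a character class treats an astral symbol as two separate UTF-16 code units, so the pattern has to be written as a high surrogate followed by a class of low surrogates. The helper names below are hypothetical, and the sketch only handles ranges whose endpoints share the same high surrogate.

```javascript
'use strict';

// Split an astral code point (U+10000 to U+10FFFF) into its surrogate pair.
function toSurrogates(codePoint) {
  const offset = codePoint - 0x10000;
  return [0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)];
}

// Format a UTF-16 code unit as a \uXXXX escape sequence.
function escapeUnit(unit) {
  return '\\u' + ('0000' + unit.toString(16).toUpperCase()).slice(-4);
}

// Build regex source for an inclusive astral range. Simplification: both
// endpoints must share a high surrogate; a full solution splits wider ranges.
function astralRangeToSource(start, end) {
  if (start < 0x10000 || end > 0x10FFFF || start > end) {
    throw new RangeError('Expected an ascending astral range');
  }
  const [startHigh, startLow] = toSurrogates(start);
  const [endHigh, endLow] = toSurrogates(end);
  if (startHigh !== endHigh) {
    throw new RangeError('Range spans several high surrogates');
  }
  return escapeUnit(startHigh) + '[' + escapeUnit(startLow) + '-' + escapeUnit(endLow) + ']';
}

// Naive class: the engine sees two code units, so it can only ever match
// one half of the symbol U+1D306, never the whole symbol.
const naive = /^[\uD834\uDF06]$/;

// Surrogate-aware pattern for U+1D306 through U+1D308.
const fixed = new RegExp('^' + astralRangeToSource(0x1D306, 0x1D308) + '$');

console.log(naive.test('\uD834\uDF06')); // false
console.log(fixed.test('\uD834\uDF06')); // true
```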
Would you be interested in a pull request that ports all the regular expressions to Regenerate + adds a simple build script?