Closed nicolo-ribaudo closed 5 years ago
For what it's worth, regexp-tree is written by @DmitrySoshnikov who works at Facebook, and he's one of the smartest people I know when it comes to parsers and regular expressions. 👍
After reading the GitHub readmes for regexp-tree and regexpu-core, it sounds like regexp-tree could be more flexible? regexpu-core seems to be designed for a single purpose.
non-standard features (like comments in patterns)
Is that non-standard? Nearly all implementations support it. regular-expressions.info says that it's just some very old or simplified implementations that don't support it: https://www.regular-expressions.info/freespacing.html
Of the flavors discussed in this tutorial, only XML Schema and the POSIX and GNU flavors don't support it. Plain JavaScript doesn't either, but XRegExp does
I wouldn't call it "non-standard", I'd call JavaScript's regex the non-standard one. It's lacking standard features that every other language has 😛
@Daniel15 For the purposes here, the only standard we care about is ECMAScript. And regexp-tree certainly goes above and beyond what that standard requires.
Won't that make it easier to add new features in the future, though? It seems more future proof to use a library that supports more advanced regex features, particularly since it's likely that there'll be more requests to improve JS regex support such that it's on par with other languages. JS regexes have been lagging far behind for a very long time, so I'm happy to see these improvements :)
Hey guys,
Just an FYI, if you nevertheless prefer going with regexp-tree (which is fully based on ECMAScript), it is possible to apply just this one transform, for named capturing groups.
This is handled by the compat-transpiler module, which may have a whitelist of transforms to apply.
In particular, for the use-case:
const regexpTree = require('regexp-tree');
// Using new syntax.
const originalRe = '/(?<all>x)\\k<all>/';
// For legacy engines.
const compatTranspiledRe = regexpTree
  .compatTranspile(originalRe, ['namedCapturingGroups'])
  .toRegExp();
console.log(compatTranspiledRe); // /(x)\1/
The transform also returns the names of the captured groups, which can then be passed to the runtime module you mention. I use a similar approach with a custom exec method (see this unit test, which accesses the groups property).
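As a rough sketch of that runtime side: given a transpiled regexp and a map from group names to capture indices, a custom exec can rebuild the groups property. The helper name `execWithGroups` and the `{ name: index }` shape of the map are assumptions made for illustration; they are not regexp-tree's or Babel's actual API.

```javascript
// Hypothetical helper, not part of regexp-tree or Babel.
// `groupNames` maps each group name to its 1-based capture index,
// e.g. { all: 1 } for the transpiled pattern /(x)\1/.
function execWithGroups(re, groupNames, str) {
  const match = re.exec(str);
  if (match === null) return null;

  // Use a null-prototype object so group names cannot clash with
  // inherited properties such as "constructor".
  const groups = Object.create(null);
  for (const [name, index] of Object.entries(groupNames)) {
    groups[name] = match[index]; // undefined if the group did not participate
  }
  match.groups = groups;
  return match;
}

const match = execWithGroups(/(x)\1/, { all: 1 }, 'xx');
console.log(match.groups.all); // x
```

The regexp itself stays a plain legacy-compatible pattern; only the result object is decorated, so no builtins are modified.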
@Daniel15 it's not more future-proof if the features that land in the spec itself end up differing; it's quite critical that the default for a feature like this be "nothing beyond what's in the spec".
@ljharb, right, with the whitelist parameter I mentioned, you may granularly pick only needed things from the spec (and nothing beyond).
But it's completely up to you which tool you choose of course -- I built regexp-tree to be "based on ECMAScript" (including its parsing grammar, etc) + cooler features if one needs them, but they are not enforced, and you may restrict purely to ES spec.
@DmitrySoshnikov oh sure, i'm talking about for babel :-) your tool can choose whatever defaults it likes!
@ljharb, exactly for Babel ;) all the features are from ECMAScript, and can granularly be picked one by one. Everything that goes on top of it (x flag, comments, multiline, etc) can be disabled.
After reading the Github readmes for regexp-tree and regexpu-core, it sounds like regexp-tree could be more flexible? regexpu-core seems to be designed for a single purpose.
IMHO it makes more sense to compare regjsparser with regexp-tree. regjsparser is the parser that regexpu uses — regexp-tree wasn’t available at the time regexpu was created.
regjsparser doesn’t yet know about named capture groups (or lookbehind assertions), so we’d have to teach it before regexpu can handle those.
Thank you all for your comments.
Everything that goes on top of it (x flag, comments, multiline, etc) can be disabled.
I think I'll use regexp-tree then, since it already supports named groups. If needed, we can always switch to regexpu-core in the future.
Thanks to @nicolo-ribaudo’s work in https://github.com/mathiasbynens/regexpu-core/pull/14, regexpu-core now supports named groups as well.
Is Babel going to support this feature? I have looked at the Babel packages and I haven't seen this plugin there.
@brneto Great question, I honestly don't know but am interested myself. According to https://github.com/babel/babel/pull/7105 the docs haven't been updated yet and some unit tests need to be fixed, so hopefully soon!
There is a similar issue in the custom https://github.com/DmitrySoshnikov/babel-plugin-transform-modern-regexp/issues/3 repo, though as @nicolo-ribaudo mentioned there, he'll try to take the original PR in Babel. @nicolo-ribaudo, are you still on track for the PR?
+1!
Named capture groups are being merged in the main spec (https://github.com/tc39/ecma262/pull/1027). We need a plugin for that :slightly_smiling_face:
There is already https://www.npmjs.com/package/babel-plugin-transform-modern-regexp, but it does a lot of things. I think we should have a plugin which only transforms named groups (like we have for every other regexp feature).
It would consist of two parts:
1) The pattern transpiler, e.g. (?<name>.)\k<name> -> (.)\1
We have two options for this: regexp-tree and regexpu-core. I would prefer using regexpu-core, because regexp-tree parses by default some non-standard features (like comments in patterns). On the other hand, it would require some more time because first I need to implement named groups in regexpu-core. The plugin might not be ready before the proposal is merged into the spec.
2) The runtime wrapper
This is needed to add the groups property to the result of .match / .exec / .replace. I think the implementation which gives the best browser support requires overwriting RegExp#exec and String#replace. Another option, which doesn't require modifying builtins, is to use a wrapper class.
@babel/babel Thoughts? (especially about regexp-tree vs regexpu-core)
cc @mathiasbynens
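To make part 1 concrete, here is a minimal sketch of what the pattern rewrite has to do. It is a hand-rolled scanner written for illustration, not how regexp-tree or regexpu-core work internally; the function name `transpileNamedGroups` and its return shape are assumptions.

```javascript
// Hypothetical sketch: rewrite named groups to numbered ones.
// Returns the rewritten pattern source plus a name -> index map.
// Known limitation: a \k<name> followed directly by a digit would need
// extra care, because "\1" + "0" reads as "\10".
function transpileNamedGroups(source) {
  const names = Object.create(null); // group name -> 1-based capture index
  const parts = [];                  // strings, or { ref: name } for \k<name>
  let groupCount = 0;
  let inClass = false;               // inside a [...] character class?

  for (let i = 0; i < source.length; i++) {
    const ch = source[i];
    if (ch === '\\') {
      const ref = /^k<([^>]+)>/.exec(source.slice(i + 1));
      if (ref && !inClass) {
        parts.push({ ref: ref[1] }); // resolved later: allows forward references
        i += ref[0].length;
      } else {
        parts.push(source.slice(i, i + 2)); // keep any other escape untouched
        i += 1;
      }
    } else if (inClass) {
      if (ch === ']') inClass = false;
      parts.push(ch);
    } else if (ch === '[') {
      inClass = true;
      parts.push(ch);
    } else if (ch === '(') {
      // "(?<name>" is a named group; "(?<=" and "(?<!" are lookbehinds.
      const named = /^\?<([^=!>][^>]*)>/.exec(source.slice(i + 1));
      if (named) {
        names[named[1]] = ++groupCount;
        parts.push('(');
        i += named[0].length;
      } else {
        if (source[i + 1] !== '?') groupCount++; // plain capturing group
        parts.push(ch);
      }
    } else {
      parts.push(ch);
    }
  }

  const rewritten = parts
    .map((part) => {
      if (typeof part === 'string') return part;
      if (!(part.ref in names)) {
        throw new SyntaxError('Unknown group name: ' + part.ref);
      }
      return '\\' + names[part.ref];
    })
    .join('');

  return { source: rewritten, names };
}

const example = transpileNamedGroups('(?<name>.)\\k<name>');
console.log(example.source);     // (.)\1
console.log(example.names.name); // 1
```

The `names` map is the piece a runtime wrapper (part 2) would need in order to rebuild the groups property from the numbered captures.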