tc39 / proposal-regex-escaping

Proposal for investigating RegExp escaping for the ECMAScript standard
http://tc39.es/proposal-regex-escaping/
Creative Commons Zero v1.0 Universal

New library with context-aware escaping via a template tag #79

Open slevithan opened 4 months ago

slevithan commented 4 months ago

I would be happy to see RegExp.escape exist (in fact, I advocated for it in several comments at the top of the years-old es-discuss thread mentioned in this proposal's motivation, and I’ve been shipping XRegExp.escape for > 15 years). But there are some advantages to escaping via interpolation in a template tag that have been discussed in depth in previous issues (including #37 and #45).
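To make the two approaches concrete, here is a minimal sketch that contrasts a standalone escape function with a template tag that escapes interpolated values. The names `escapeRegExp` and `re` are hypothetical and used only for illustration; the escape shown is the widely used "backslash every syntax character" approach, not the algorithm specified by this proposal or the API of any library mentioned here.

```javascript
// Hypothetical helper: backslash-escape regex syntax characters.
// Assumption: this is the common community recipe, not the proposal's algorithm.
function escapeRegExp(str) {
  return String(str).replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical tag: the literal template text is used as raw pattern source,
// and every interpolated value is escaped so that it matches literally.
function re(strings, ...values) {
  let source = strings.raw[0];
  values.forEach((value, i) => {
    source += escapeRegExp(value) + strings.raw[i + 1];
  });
  return new RegExp(source);
}

const userInput = 'a.b';

// Function style: the caller concatenates strings and must remember to escape.
const viaEscape = new RegExp('^' + escapeRegExp(userInput) + '$');

// Tag style: pattern text and data are kept apart by the template itself.
const viaTag = re`^${userInput}\d+$`;

console.log(viaEscape.test('a.b'), viaEscape.test('axb'));
console.log(viaTag.test('a.b42'), viaTag.test('axb42'));
```

With the tag, forgetting to escape is not possible, and because the tag reads `strings.raw`, `\d` can be written without doubling the backslash.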

I created a new library, Regex.make, that among other features includes robust support for context-aware escaping via interpolation. It improves upon the 2015 proof of concept of this feature (regexp-make-js) by @mikesamuel and @erights in several ways, including full support of ES2024 regexes, avoiding or fixing various edge cases, and offering context-aware escaping, sandboxing, and atomization of interpolated values in a greater number of contexts (including e.g. on range or set operation boundaries and within enclosed tokens).
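As a rough illustration of what "context-aware" can mean, the sketch below handles just two contexts: inside a character class, and everywhere else. It is a simplified assumption-laden example (no flags, no nested classes as allowed by flag v, hypothetical names `re`, `escapeDefault`, `escapeInClass`, `endsInsideClass`) and does not reflect how the library described above is implemented.

```javascript
// Escape for use outside a character class.
const escapeDefault = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
// Escape for use inside a character class, where \ ] ^ - are the risky characters.
const escapeInClass = (s) => s.replace(/[\\\]^-]/g, '\\$&');

// Naive scanner: reports whether the pattern is inside [...] after this chunk.
// Assumption: no nested classes, so a single boolean is enough.
function endsInsideClass(chunk, inClass) {
  for (let i = 0; i < chunk.length; i++) {
    const ch = chunk[i];
    if (ch === '\\') i++; // skip the escaped character
    else if (ch === '[') inClass = true;
    else if (ch === ']') inClass = false;
  }
  return inClass;
}

function re(strings, ...values) {
  let inClass = false;
  let source = '';
  strings.raw.forEach((chunk, i) => {
    source += chunk;
    inClass = endsInsideClass(chunk, inClass);
    if (i < values.length) {
      const value = String(values[i]);
      // Outside a class, wrap the value in a group so that a following
      // quantifier applies to the whole value, not just its last character.
      source += inClass ? escapeInClass(value) : `(?:${escapeDefault(value)})`;
    }
  });
  return new RegExp(source);
}

const repeated = re`^${'ab'}+$`;  // built as ^(?:ab)+$
const anyOf = re`^[${'^a-c'}]+$`; // built as ^[\^a\-c]+$

console.log(repeated.test('abab'), repeated.test('abb'));
console.log(anyOf.test('a-c^'), anyOf.test('b'));
```

A context-free escape cannot get both cases right at once: without the group, `+` would repeat only `b`, and without class-specific escaping, `a-c` would become a range and a leading `^` would negate the class.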

It additionally addresses the use case of composing a dynamic number of strings (among other use cases) via the concept of "partial pattern" strings that are interpolated in a context-aware way without escaping special regex characters.
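The sketch below shows one way such a "partial pattern" could work for the dynamic-list use case. The `PartialPattern` class, the `partial` helper, and the `re` tag are hypothetical names chosen for this example and are not taken from the library; the point is only that a marked value is inserted as pattern text while plain strings are still escaped.

```javascript
const escapeRegExp = (s) => String(s).replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

// Hypothetical marker type: a string that is already valid pattern text.
class PartialPattern {
  constructor(source) {
    this.source = String(source);
  }
}
const partial = (source) => new PartialPattern(source);

function re(strings, ...values) {
  let source = strings.raw[0];
  values.forEach((value, i) => {
    // Partial patterns are trusted as pattern text; everything else is escaped.
    const piece = value instanceof PartialPattern ? value.source : escapeRegExp(value);
    // The non-capturing group keeps an alternation inside the piece
    // from splitting the surrounding pattern.
    source += `(?:${piece})` + strings.raw[i + 1];
  });
  return new RegExp(source);
}

// A list whose length is only known at run time.
const words = ['c++', 'c#', 'f#'];
const alternation = partial(words.map(escapeRegExp).join('|'));
const langRe = re`^${alternation}$`;

console.log(langRe.source);
console.log(langRe.test('c++'), langRe.test('xc#x'));
```

Each word is escaped individually, and only the joining `|` is treated as syntax. Because the interpolated piece is grouped, `^` and `$` still anchor every alternative.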

I'm in the camp that a standardized template tag can coexist with a standardized RegExp.escape. I'm hoping that sharing this here leads to additional insight/discussion (which I acknowledge might only be tangentially related to RegExp.escape) or is at least an interesting reference point.

CC @domenic, @littledan, @ljharb.

ljharb commented 4 months ago

Thanks for the issue. Thankfully the committee has agreed that the two methods aren't mutually exclusive, and a follow-on proposal to add a template tag function would be appreciated :-)

(btw, https://www.npmjs.com/package/regex-make doesn't exist - is your new library published to npm so it's directly usable?)

slevithan commented 4 months ago

> a follow-on proposal to add a template tag function would be appreciated :-)

I've written up some thoughts about a standardized template tag as a replacement for proposed flag x (that offers a new regex happy path), here: https://github.com/tc39/proposal-regexp-x-mode/issues/8

If there is someone willing to be the primary champion for a tag proposal, I'd be happy to collaborate or co-champion. (I wouldn't want to be the primary since some of the work involved, as well as the formal language of the spec, seems daunting to me given that I haven't been involved before.)

> is your new library published to npm so it's directly usable?

I've now published it as npmjs.com > regex. I think the details of this library (minus its syntax extensions) could serve as a helpful starting point for a tag proposal.

ljharb commented 4 months ago

i am astonished that "regex" was available on npm, nice grab :-)

erights commented 4 months ago

Hi @slevithan, I am interested in this, but I cannot take the time to lead, or to write the spec text.

I am also interested in your "atomic groups" (?>...) at https://github.com/slevithan/regex-make?tab=readme-ov-file#atomic-groups to help avoid ReDoS. Is this something that could be proposed as an extension to the builtin regexp syntax? What are the pros and cons?
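For readers unfamiliar with atomic groups: JavaScript has no native (?>...) syntax, but the behavior can be approximated with a lookahead plus a backreference, since the engine does not backtrack into a lookahead once it has succeeded. The sketch below shows that commonly used emulation; it is offered as an illustration of the semantics and makes no claim about how any particular library or proposal implements the feature.

```javascript
// Ordinary group: when "bc" does not lead to an overall match,
// the engine backtracks into the group and tries "b" instead.
const backtracking = /^a(?:bc|b)c$/;

// Emulated atomic group, roughly equivalent to ^a(?>bc|b)c$:
// the lookahead captures whatever the group matches first, and \1
// consumes exactly that text. The lookahead is not retried with a
// different alternative, so the first choice ("bc") is locked in.
const atomicLike = /^a(?=(bc|b))\1c$/;

console.log(backtracking.test('abc')); // true
console.log(atomicLike.test('abc'));   // false
console.log(atomicLike.test('abcc'));  // true
```

Cutting off backtracking this way is what helps against catastrophic backtracking, at the cost of rejecting some inputs that a backtracking group would accept, as `'abc'` shows.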

Attn @waldemarhorwat

bakkot commented 4 months ago

@erights There's a stage 1 proposal for atomic groups from @rbuckton here.

erights commented 4 months ago

Thanks. I am enthused about this!

Discussion with @rbuckton proceeding at https://github.com/tc39/proposal-regexp-x-mode/issues/8#issuecomment-2150916948

slevithan commented 4 months ago

> i am astonished that "regex" was available on npm, nice grab :-)

It wasn't. :-) I'd been in discussion with the prior owner about repurposing it, but that took time, which is why I hadn't yet published to npm when I first posted.

@erights, glad to know you're interested.