tc39 / proposal-regexp-match-indices

ECMAScript RegExp Match Indices
https://arai-a.github.io/ecma262-compare/?pr=1713
BSD 3-Clause "New" or "Revised" License
64 stars 13 forks source link

Naming alignment — option field and resulting field names #15

Closed SMotaal closed 5 years ago

SMotaal commented 5 years ago

It is exciting to see novel work on the exec interface being also forward-looking. I am sure at least a few have dared to explore this space so it makes sense to expect a few more proposals coming this way.

That said, can I propose reconsidering the parametric aspects further here.

I am particularly wondering that with exec(…, options) being an object it might (imho) lend nicer to change the parameter to options.captureOffsets = true.

Not sure if that is easier to optimize engine-side, but JS-side, and at least if more potential options come this way, boolean checks will likely be easier to work with (and review).

It also gives a nice parallelism between options.capture… and match.… which might lessen the cognitive burdens, especially for developers that generally try to avoid RegExp because of its mystery.

Certainly there is more to the naming which I might have missed (apologies if this is redundant).

Thanks

ljharb commented 5 years ago

That does make them harder to annotate in type systems like TS and Flow; typically enums are preferred over booleans as well because then you can add a third option in the future.

SMotaal commented 5 years ago

I personally cannot talk about internals, but also feel a little hesitant to think it technically ill-advised (ie it is important enough to force on humans something so that machine does not error).

Irrespective, I guess the more immediate concern I am getting at, for someone not with the same perspective looking at { capture: 'indices' } being what it takes to end up with { offsets: something[] } is seemingly not as straightforward — more so of { capture: 'strings' } being the default and what that means.

My fear is that capture in this context is too generic, in the sense that down the road something might be proposed that could more intuitively relate to capture but will need yet another somewhat confusing (to one or more that are not the authors) burden.

Do you agree this is a valid concern generally — to at least want to put out feelers on this? (my opinion being just one that made its way to you of possibly more out there)

rbuckton commented 5 years ago

capture is the term used for what a RegExp does for (pattern) (a capture group) and (?<name>pattern) (a named capture group). It is also consistent with the terminology used in other languages and in the Prior Art referenced in the explainer.

{ capture: 'indices' } being what it takes to end up with { offsets: something[] }

That is not what happens. { capture: 'indices' } indicates that the captures should only be pairs of indices:

const re = /a*(?<Z>z)?/;
const input = "xaaaz";
const match1 = re.exec(input, { capture: 'strings' }); // or just re.exec(input)
console.log(match1); // ["aaaz", "z"]

const match2 = re.exec(input, { capture: 'indices' });
console.log(match2); // [[1, 5], [4, 5]]

This differs from the alternative version of proposal that instead adds an additional indices property to the result.

SMotaal commented 5 years ago

@rbuckton — thanks for the added clarification… to be fair, I'll need to reflect more on this myself before arguing oneway or another, and at least hope my input here being one from the end-user perspective is of value somehow.

rbuckton commented 5 years ago

I'm closing this as the consensus decision was to switch back to a property on the result from RegExpBuiltinExec. As a result we are no longer using an options object.