tc39 / proposal-regexp-match-indices

ECMAScript RegExp Match Indices
https://arai-a.github.io/ecma262-compare/?pr=1713
BSD 3-Clause "New" or "Revised" License
64 stars 13 forks source link

Consider removing internal slots from GetIndices #28

Closed schuay closed 4 years ago

schuay commented 5 years ago

GetIndices is currently specified as a builtin function with internal slots [[InputString]], [[MatchIndices]], [[MatchIndicesArray]]. This requires allocation of a new function for each regexp result object.

Have you considered adding any required slots to the results object instead, and looking them up from the receiver when GetIndices is called?

cc @joshualitt @mathiasbynens

rbuckton commented 5 years ago

I have no issue with modifying the proposal if it becomes necessary to do so. I authored GetIndices in this way as the ECMAScript spec had readily available mechanisms for defining a built-in function in this fashion. @ljharb, @bterlson: Do you have any thoughts on this approach?

ljharb commented 5 years ago

If there’s not a new function for each result object, then it’s either frozen or a global communications channel. We’ve typically provided new functions each time rather than sharing a global frozen function.

ljharb commented 5 years ago

(Another alternative is a prototype method that looks up slots, but that’s not available for match objects since they’re normal arrays)

joshualitt commented 5 years ago

Unfortunately, creating a new closure for each result object will add overhead to every regexp.exec call, regardless of whether or not indices are actually used. If we go with the global frozen function, then indices are nearly free unless they are used. Does it make sense to weigh efficiency concerns here?

EDIT: Actually maybe I misspoke and we can wrap the result in a closure lazily. Still might be worth considering the additional cost of the per result closure.

ljharb commented 5 years ago

Additionally, if the slots are on the match object, it makes the object exotic - as currently specified, it’s just the getter function that has special behavior, and the match object remains ordinary.

syg commented 5 years ago

What was the motivation for reflecting the laziness of .indices in the spec itself? The root of the problem to me here is that reflection created an observable thing (i.e. that the .indices property descriptor is now a getter with different identity for each results object) that ties implementations' hands. I would much rather prefer it to be a vanilla data property and the array created eagerly in spec. An implementation can still choose to lazily allocate the array.

rbuckton commented 5 years ago

If implementers believe it should just be a data property then I'm fine with the change. The goal was to avoid the upfront penalty for initializing the value given the previous reticence towards possibly negatively impacting all existing RegExp uses.

syg commented 5 years ago

Sounds good. I'll draft a PR soon.

rbuckton commented 4 years ago

Fixed by #31