firasdib / Regex101

This repository is currently only used for issue tracking for www.regex101.com

Effect of the 'g' flag on capture groups is not respected #2031

Closed DovJacobson closed 1 year ago

DovJacobson commented 1 year ago

Thanks for this webapp -- I use it all the time & I love it. However, this issue threw me under the bus for a while today:

In JavaScript (at least), match() returns an array. With the /g flag, this is an array of all the matches; without it, it is an array of the first match plus its capture groups.

This is not reflected in your UI, where you show results from the capture groups even when the /g flag has disabled them.

InSyncWithFoo commented 1 year ago

Just imagine that it would use .matchAll() instead of just .match(). This site is about regex, not JavaScript or any language's API.

DovJacobson commented 1 year ago

@InSyncWithFoo
1: JS was just an example. I dunno whether the same result is true in other languages.
2: This site is not about regex. It is about Regex101.
3: This bug was observed with the ECMAScript flavor and the match() function selected. Very relevant. To its credit, Regex101 does indeed care about 'any language's API'.
4: If your message means that matchAll() returns an array of arrays (sort of) that contains all the capture groups for all the matches, yes, you are right and this is a handy thing to know. Thank you. But it is not really relevant. Regex101 currently misreports the results of match() with the /g flag.
I love Regex101 and want it to be even more perfecter than it already is!

working-name commented 1 year ago

Hi @DovJacobson,

I can see where you'd think the Match tab is the function name in JavaScript, but it has nothing to do with it. You'll notice that the same label remains no matter what flavor you select. For example, selecting PHP doesn't make it change to preg_match(), and then to preg_match_all() if you enable the /g flag.

firasdib commented 1 year ago

Not sure I completely follow. Every flavor will always return the first match (including the capture groups from said match) when the g flag is not enabled. If the g flag is enabled, all matches are returned.

The behavior also seems to be in line with the match function in Javascript, e.g.

"abcdef".match(/.(.)(.)/) // ['abc', 'b', 'c']

DovJacobson commented 1 year ago

I am trying to be clear, @firasdib (and Thank you for Regex101!!)

Bug: When the g flag is used in an ECMAScript match(), capture groups are omitted from the result. Regex101 indicates that they are returned.

Demo: Here are the results that you anticipate correctly with the vanilla regex, followed by the entirely different results observed with the g flag appended:

"abcdef".match(/.(.)(.)/)
(3) ['abc', 'b', 'c', index: 0, input: 'abcdef', groups: undefined]

"abcdef".match(/.(.)(.)/g)
(2) ['abc', 'def']

(Demo performed in the Chrome DevTools console while typing this post:)

firasdib commented 1 year ago

@DovJacobson regex101 does not utilize the match function of JavaScript. Instead, you should use either exec or matchAll.
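
For readers following along, here is a minimal sketch of the exec route, which keeps the capture groups for every match even with the g flag. The helper name collectMatches is hypothetical and only for illustration; it is not part of regex101 or of JavaScript, and this is not necessarily how regex101 works internally.

```javascript
// Collect every match together with its capture groups using RegExp.prototype.exec.
// `collectMatches` is a hypothetical helper name used only for this sketch.
function collectMatches(pattern, input) {
  // exec only advances through the string when the regex is global (or sticky),
  // so build a global copy to avoid looping forever on a non-global pattern.
  const flags = pattern.flags.includes("g") ? pattern.flags : pattern.flags + "g";
  const re = new RegExp(pattern.source, flags);
  const results = [];
  let m;
  while ((m = re.exec(input)) !== null) {
    // Copy to a plain array: full match first, then each capture group.
    results.push(Array.from(m));
    // A zero-length match does not move lastIndex, so advance it manually.
    if (m[0] === "") re.lastIndex++;
  }
  return results;
}

const found = collectMatches(/.(.)(.)/g, "abcdef");
console.log(found); // [['abc', 'b', 'c'], ['def', 'e', 'f']]
```

Each exec call resumes from the regex's lastIndex, so looping until it returns null visits every match in turn, each with its own groups.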

DovJacobson commented 1 year ago

Thank you, @firasdib. I was not looking for a workaround. I was hoping to improve a tool I love. Users trust Regex101 to predict the results of an expression in the user's environment. I over-relied on it, and it cost me some real time. I will not have this problem again. Others will.

But to follow through -- you are correct, of course: matchAll() and exec() work perfectly. Thanks for the advice & good luck.

[..."abcdef".matchAll(/.(.)(.)/g)]
...
0:  (3) ['abc', 'b', 'c', index: 0, input: 'abcdef', groups: undefined]
1:  (3) ['def', 'e', 'f', index: 3, input: 'abcdef', groups: undefined]
...

Dov out.