zeusdeux / re2

Automatically exported from code.google.com/p/re2
BSD 3-Clause "New" or "Revised" License

No support for optional groups #48

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?

Write a regex with an optional capture group and use FullMatchN() to match it 
against some text that should match.
Example:
regex  "ignore\s(take)(?:\s(take_if_present))?"
text 1 "ignore take"
text 2 "ignore take take_if_present"

What is the expected output? What do you see instead?

Both texts should match. The regex works for example within RegExpEditor. The 
missing capture group should be reported in some way, maybe returning an empty 
string or adding a per-capture group boolean status, or in any other way.

What version of the product are you using? On what operating system?

I just updated to the current source version. I'm not sure about how to use 
Mercurial, but hg tip reports 65:a9f0eaee31d7 as the changeset. My OS is 
Mandriva Linux 2011, I'm building with gcc 4.6.1.

Please provide any additional information below.

I didn't test the code with the example I reported, so here is real data taken 
from a .obj file (3D mesh):

vt 0.458942 0.532979
vt 0.459099 0.533016 0

and my hardcoded regex:

(-?\\d+(?:\\.\\d+)?)\\s+(-?\\d+(?:\\.\\d+)?)(?:\\s+(-?\\d+(?:\\.\\d+)?))?\\s*$

This same regex works in RegExpEditor (with the backslashes unescaped, of 
course) on both lines (the last float at the end is optional), but FullMatchN 
only returns success for the second line.

Original issue reported on code.google.com by mich...@gmx.tm on 7 Sep 2011 at 11:11

GoogleCodeExporter commented 9 years ago
After discussing with one of the authors, it seems that for types other than 
StringPiece this is somewhat expected behaviour. Since I'm parsing floats, the 
conversion code within FullMatch fails on the empty capture, but because the 
return type is a boolean there's no way to know whether it was an internal 
failure or a real regex mismatch.
The solution is to always use a StringPiece whenever you may have empty 
capturing groups (I imagine this problem is also present in the "(.*)" case), 
and then do the type conversion yourself. This is especially necessary if the 
regex comes from outside your program.
I hope this will save some time for those who end up in the same situation :)
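
For anyone who wants to see the "capture as text, convert yourself" idea spelled out, here is a minimal sketch. To keep it free of extra dependencies it uses std::regex from the standard library as a stand-in, not RE2; with RE2 the equivalent step would be to pass StringPiece arguments to FullMatch and convert afterwards. The names TexCoord and ParseTexCoord are made up for illustration, and the sketch assumes the "vt" keyword has already been stripped from the line.

```cpp
#include <regex>
#include <string>

// Hypothetical holder for one texture coordinate. has_w records whether the
// optional third capture group took part in the match.
struct TexCoord {
    double u = 0.0;
    double v = 0.0;
    double w = 0.0;
    bool has_w = false;
};

// Parses the numeric part of a "vt" line, e.g. "0.458942 0.532979" or
// "0.459099 0.533016 0". Returns false only when the regex does not match.
// std::regex is used here as a stand-in for RE2 (an assumption of this sketch).
bool ParseTexCoord(const std::string& text, TexCoord* out) {
    static const std::regex kPattern(
        "(-?\\d+(?:\\.\\d+)?)\\s+(-?\\d+(?:\\.\\d+)?)"
        "(?:\\s+(-?\\d+(?:\\.\\d+)?))?\\s*$");

    std::smatch m;
    if (!std::regex_match(text, m, kPattern)) {
        return false;  // a real regex mismatch
    }

    // Groups 1 and 2 are mandatory, so they always hold text here.
    out->u = std::stod(m[1].str());
    out->v = std::stod(m[2].str());

    // Group 3 is optional: convert it only if it took part in the match,
    // instead of letting a failed conversion look like a mismatch.
    out->has_w = m[3].matched;
    out->w = out->has_w ? std::stod(m[3].str()) : 0.0;
    return true;
}
```

Called with "0.458942 0.532979" this returns true with has_w left false, and with "0.459099 0.533016 0" it returns true with has_w set, so a missing optional group is reported separately from a mismatch.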

Original comment by mich...@gmx.tm on 8 Sep 2011 at 11:17

GoogleCodeExporter commented 9 years ago

Original comment by rsc@golang.org on 10 Jan 2014 at 1:14