jamadden / mrab-regex-hg

Automatically exported from code.google.com/p/mrab-regex-hg

non-greedy quantifier in lookbehind #71

Closed: GoogleCodeExporter closed this issue 9 years ago

GoogleCodeExporter commented 9 years ago
Hi,
I just noticed an inconsistency when using lookbehinds that contain greedy and 
non-greedy quantifiers. However, I don't know what the specified behaviour is for 
such cases (re doesn't support variable-length lookbehind at all). Compare:

>>> regex.findall(r"(?<=:\S+ )\w+", ":9 abc :10 def")
['abc', 'def']
>>> regex.findall(r"(?<=:\S+? )\w+", ":9 abc :10 def")
['def']

That is, \S+? inside the lookbehind somehow fails to match the single digit in ":9" (whereas \S*? does match it).

I'd expect both quantifiers to match in the same way in this case, and 
(probably?) the results to be consistent with the pattern in which the 
lookbehind is instead part of the matched text:

>>> regex.findall(r":\S+ \w+", ":9 abc :10 def")
[':9 abc', ':10 def']
>>> regex.findall(r":\S+? \w+", ":9 abc :10 def")
[':9 abc', ':10 def']
>>> 
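
The expectation above can also be written down without any lookbehind. The following is only a sketch using the standard-library re module: it captures the word after the prefix with a group, which for this particular input should give the list that both lookbehind variants are expected to return. The helper names words_after_prefix and re_accepts_variable_lookbehind are hypothetical and not part of regex or re. Note that a capture group consumes the prefix while a lookbehind does not, so the two approaches are only assumed to agree on inputs like this one.

```python
import re

TEXT = ":9 abc :10 def"


def words_after_prefix(prefix_pattern, text=TEXT):
    """Return the words following prefix_pattern, using a capture group
    in place of a lookbehind (hypothetical helper for illustration)."""
    # With exactly one group, findall returns the group contents only.
    return re.findall(prefix_pattern + r"(\w+)", text)


def re_accepts_variable_lookbehind():
    """Report whether the stdlib re module compiles a variable-length
    lookbehind (hypothetical helper for illustration)."""
    try:
        re.compile(r"(?<=:\S+ )\w+")
    except re.error:
        return False
    return True


greedy = words_after_prefix(r":\S+ ")   # greedy prefix
lazy = words_after_prefix(r":\S+? ")    # non-greedy prefix

print(greedy)
print(lazy)
print(re_accepts_variable_lookbehind())
```

Both prefix forms yield the same words here, which is the consistency I would expect from the lookbehind versions as well.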

(Using regex-0.1.20120613, python 2.7.3, Win 7.)

best regards,
          vbr

Original issue reported on code.google.com by Vlastimil.Brom@gmail.com on 5 Jul 2012 at 5:00

GoogleCodeExporter commented 9 years ago
Fixed in regex 0.1.20120705.

Original comment by re...@mrabarnett.plus.com on 5 Jul 2012 at 6:30

GoogleCodeExporter commented 9 years ago
Wow, that was fast...
 Thanks for the instant fix!
       vbr

Original comment by Vlastimil.Brom@gmail.com on 5 Jul 2012 at 7:36