jamadden / mrab-regex-hg

Automatically exported from code.google.com/p/mrab-regex-hg

Forward references; nested references? #21

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
I'd like to ask about support for forward references and nested references 
in regex ( http://www.regular-expressions.info/brackets.html ).
I couldn't find any mention of this in the documentation, but it seems that 
forward references are supported, while nested references are not:

>>> regex.search(r"(\2b|(a))+", "-aab-").group()
'aab'
>>> 
>>> regex.search(r"(\1b|(a))+", "-aab-").group()
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "C:\Python27\lib\regex.py", line 235, in search
    return _compile(pattern, flags, kwargs).search(string, pos, endpos,
  File "C:\Python27\lib\regex.py", line 423, in _compile
    parsed = parse_pattern(source, info)
  File "C:\Python27\lib\_regex_core.py", line 334, in parse_pattern
    branches = [parse_sequence(source, info)]
  File "C:\Python27\lib\_regex_core.py", line 350, in parse_sequence
    item = parse_item(source, info)
  File "C:\Python27\lib\_regex_core.py", line 363, in parse_item
    element = parse_element(source, info)
  File "C:\Python27\lib\_regex_core.py", line 587, in parse_element
    element = parse_paren(source, info)
  File "C:\Python27\lib\_regex_core.py", line 723, in parse_paren
    subpattern = parse_pattern(source, info)
  File "C:\Python27\lib\_regex_core.py", line 334, in parse_pattern
    branches = [parse_sequence(source, info)]
  File "C:\Python27\lib\_regex_core.py", line 350, in parse_sequence
    item = parse_item(source, info)
  File "C:\Python27\lib\_regex_core.py", line 363, in parse_item
    element = parse_element(source, info)
  File "C:\Python27\lib\_regex_core.py", line 584, in parse_element
    return parse_escape(source, info, False)
  File "C:\Python27\lib\_regex_core.py", line 1035, in parse_escape
    return parse_numeric_escape(source, info, ch, in_set)
  File "C:\Python27\lib\_regex_core.py", line 1069, in parse_numeric_escape
    raise error("can't refer to an open group")
error: can't refer to an open group
>>> 

Is this right, or am I misinterpreting something?

(re fails with the same error message for the second pattern and "bogus escape: 
'\\2'" for the first one.)
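
The re behaviour noted above can be checked with a minimal sketch like the following. It assumes a recent Python 3 standard library; the exact error wording differs between versions (the messages quoted above come from Python 2.7), so the sketch only records whether compilation fails. The names `PATTERNS` and `try_compile` are made up for this illustration.

```python
import re

# The two patterns from the report: one forward reference, one nested reference.
PATTERNS = {
    "forward": r"(\2b|(a))+",  # \2 refers to a group that is defined later
    "nested": r"(\1b|(a))+",   # \1 refers to the group it is itself inside
}


def try_compile(pattern):
    """Return None if re compiles the pattern, else the error message."""
    try:
        re.compile(pattern)
    except re.error as exc:
        return str(exc)
    return None


# Collect the outcome for each pattern and show it.
results = {name: try_compile(pattern) for name, pattern in PATTERNS.items()}
for name, message in results.items():
    print(name, "->", message)
```

If both entries carry a message, re rejected both patterns at compile time, before any matching was attempted.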

thanks and regards,
   vbr

Original issue reported on code.google.com by Vlastimil.Brom@gmail.com on 26 Sep 2011 at 9:19

GoogleCodeExporter commented 9 years ago
Referring to an open group is disallowed because re disallows it.

Perl seems to allow it, but I'll have to check exactly what happens and what 
changes I may need to make.

Should it be allowed for version 0 behaviour, or just version 1?

Original comment by re...@mrabarnett.plus.com on 26 Sep 2011 at 11:35

GoogleCodeExporter commented 9 years ago
I am actually not sure what should fall under version-1-only functionality. If 
there is some indication that re disallows this by design, rather than 
simply not supporting it (as in many other cases compared to regex), then it 
would go into V1.
Given the error message in re, "bogus escape:...", it seems to be treated as an 
unknown group, whereas regex already seems to recognise it.

Personally I can't think of a compatibility problem where existing code 
written for re would require getting an exception rather than performing a 
valid match (except maybe for some tests?).
Anyway, I guess the "allowed" feature set for V0 will be a rather "political" 
decision; I for my part am grateful for all of the improvements and am 
going to use V1 exclusively, if possible.
vbr

Original comment by Vlastimil.Brom@gmail.com on 27 Sep 2011 at 6:16

GoogleCodeExporter commented 9 years ago
regex 0.1.20110927 now lets you refer to an open group in version 1.

Original comment by re...@mrabarnett.plus.com on 27 Sep 2011 at 8:02