python / cpython

hierarchical regular expression #41783

Closed 78c2223d-7d5e-43ac-8a52-dca764339127 closed 19 years ago

78c2223d-7d5e-43ac-8a52-dca764339127 commented 19 years ago
BPO 1174589
Nosy @loewis

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:
```python
assignee = None
closed_at =
created_at =
labels = ['library']
title = 'hierarchical regular expression'
updated_at =
user = 'https://bugs.python.org/ottrey'
```

bugs.python.org fields:
```python
activity =
actor = 'loewis'
assignee = 'none'
closed = True
closed_date = None
closer = None
components = ['Library (Lib)']
creation =
creator = 'ottrey'
dependencies = []
files = []
hgrepos = []
issue_num = 1174589
keywords = ['patch']
message_count = 6.0
messages = ['48105', '48106', '48107', '48108', '48109', '48110']
nosy_count = 2.0
nosy_names = ['loewis', 'ottrey']
pr_nums = []
priority = 'normal'
resolution = 'rejected'
stage = None
status = 'closed'
superseder = None
type = None
url = 'https://bugs.python.org/issue1174589'
versions = []
```

78c2223d-7d5e-43ac-8a52-dca764339127 commented 19 years ago

(From the re2 SourceForge project: http://pyre2.sourceforge.net)

The re2 library provides a hierarchical regular expression extension to the re library.

re2 extracts a hierarchy of named groups from a string, rather than the flat, incomplete dictionary that the standard re module returns.

>>> import re
>>> buf='12 drummers drumming, 11 pipers piping, 10 lords a-leaping'
>>> regex=r'^((?P<verse>(?P<number>\d+) (?P<activity>[^,]+))(, )?)*$'
>>> pat1=re.compile(regex)
>>> m=pat1.match(buf)
>>> m.groupdict()
{'verse': '10 lords a-leaping', 'number': '10', 'activity': 'lords a-leaping'}

>>> import re2
>>> buf='12 drummers drumming, 11 pipers piping, 10 lords a-leaping'
>>> regex=r'^((?P<verse>(?P<number>\d+) (?P<activity>[^,]+))(, )?)*$'
>>> pat2=re2.compile(regex)
>>> x=pat2.extract(buf)
>>> x
{'verse': [{'number': '12', 'activity': 'drummers drumming'}, {'number': '11', 'activity': 'pipers piping'}, {'number': '10', 'activity': 'lords a-leaping'}]}

(See http://pyre2.sourceforge.net/ for more details.)
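
For comparison, here is a minimal sketch of how a similar nested result could be built with the standard re module alone. It is not how re2 works internally; it assumes the repeated part of the pattern can be pulled out by hand and iterated with finditer(). The names verse_re and extract_verses are hypothetical, chosen only for this illustration.

```python
import re

buf = "12 drummers drumming, 11 pipers piping, 10 lords a-leaping"

# The original pattern: a repeated group keeps only its last capture.
flat_re = re.compile(r"^((?P<verse>(?P<number>\d+) (?P<activity>[^,]+))(, )?)*$")
flat = flat_re.match(buf).groupdict()

# Hypothetical helper pattern for a single verse. The repetition is
# handled by finditer() instead of a `*` quantifier around the group.
verse_re = re.compile(r"(?P<number>\d+) (?P<activity>[^,]+)")


def extract_verses(text):
    """Build a nested dict of all verses using only the standard re module."""
    return {"verse": [m.groupdict() for m in verse_re.finditer(text)]}


nested = extract_verses(buf)
print(flat)    # only the last verse survives
print(nested)  # every verse is kept, one dict per match
```

The manual split works for this one pattern, but it has to be redone for every regex and every nesting level, which is the bookkeeping that a hierarchical extract() would take over.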

61337411-43fc-4a9c-b8d5-4060aede66d0 commented 19 years ago

Logged In: YES user_id=21627

Is this a patch? If so, where is the code, and are you its author?

78c2223d-7d5e-43ac-8a52-dca764339127 commented 19 years ago

Logged In: YES user_id=609576

Sorry, it's more an extension than a patch. (Although maybe it could be applied as a patch to the re library.) (BTW, where is the correct place to submit extensions?)

The code is in this subversion repository: http://pintje.servebeer.com/svn/pyre2/trunk/

Or available for download here: http://sourceforge.net/project/showfiles.php?group_id=134583

And has a development wiki here: http://py.redsoft.be/pyre2/wiki/

And yes, I'm the author.

61337411-43fc-4a9c-b8d5-4060aede66d0 commented 19 years ago

Logged In: YES user_id=21627

We accept extensions only by means of patches. So you would create a patch (as a unified or context diff) against the current CVS; for completely new files, providing a tarball is also reasonable. I expect that you aren't suggesting literal inclusion of the svn trunk directory into the dist/src directory of the Python distribution.

However, in this specific case, I think whether or not the new functionality should be added to Python at all probably needs discussion. I recommend asking on python-dev; be prepared to write a PEP. As a starting point, I'm personally concerned about having a module named "re2" in the standard library. This will cause confusion for users; it might be better to merge the functionality into the re module.
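
For readers unfamiliar with the unified diff format mentioned above, here is a small sketch that produces one with the standard difflib module. In practice such a patch would come from the version control tool; the file names and file contents below are hypothetical and used only for illustration.

```python
import difflib

# Hypothetical "before" and "after" contents of a source file.
old = ["def extract(s):\n", "    return None\n"]
new = ["def extract(s):\n", "    return {}\n"]

# unified_diff yields the header lines, a hunk marker, and the
# context/removed/added lines of the patch.
diff = list(
    difflib.unified_diff(old, new, fromfile="a/example.py", tofile="b/example.py")
)
print("".join(diff))
```

Lines starting with `-` are removed and lines starting with `+` are added; unprefixed lines are context that lets the patch be applied even if the surrounding file has shifted slightly.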

78c2223d-7d5e-43ac-8a52-dca764339127 commented 19 years ago

Logged In: YES user_id=609576

Ok. I'll ask on python-dev then.

61337411-43fc-4a9c-b8d5-4060aede66d0 commented 19 years ago

Logged In: YES user_id=21627

Given the discussion on python-dev, it appears that you want to rework the code and come back when you have something you'd like to contribute. So I'm rejecting this patch for now; please open a new one when you are ready (but likely, you'd write a PEP first, anyhow).