jamadden / mrab-regex-hg

Automatically exported from code.google.com/p/mrab-regex-hg

search and replace two steps function #114

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Hi, please don't take this issue too seriously.

This small function works in two steps:
1. Search the string with a regular expression.
2. Run a second regex replacement on each match found in the first step.

It's very simple, but useful in some cases.

Here is a sample that inserts '-' between the letters of each headword in a dictionary.
------------------------------------------
# coding=utf-8

import regex as re

def search_and_replace(string,
                       search_regex, search_flags,
                       replace_regex, replace_flags,
                       replace_string):
    '''two steps: 1, search. 2, replace the first step's results.'''

    def g(string,
          search_regex, search_flags,
          replace_regex, replace_flags,
          replace_string):
        '''generator'''

        search_pattern = re.compile(search_regex, search_flags)
        replace_pattern = re.compile(replace_regex, replace_flags)

        last_position = 0
        for i in search_pattern.finditer(string):
            one_start = i.start()
            one_end = i.end()

            # unreplaced string
            yield string[last_position:one_start]

            sub_string = replace_pattern.sub(replace_string,
                                             string[one_start:one_end]
                                             )
            # replaced string
            yield sub_string

            last_position = one_end

        # unreplaced end
        yield string[last_position:]

    i = g(string,
          search_regex, search_flags,
          replace_regex, replace_flags,
          replace_string)
    return ''.join(i)

text = ('dying a.垂死的,临终的\n'
        'dye vt.&vi.染,染色 n.染料,染色\n'
        'duty n.责任,义务;职责,职务;税,关税\n'
        'dusty a.多灰尘的,灰蒙蒙的;粉末状的;灰色的\n'
        'dustbin n.垃圾箱')

result = search_and_replace(text,
                       r'^\w+', re.M,
                       r'(?<=\w)(?=\w)', 0,
                       r'-')
print(text)
print('------------')
print(result)
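
To see what each of the two patterns in the sample does, they can be tried separately on a single dictionary line. This is only a sketch: it uses the standard-library `re` module instead of `regex` (an assumption, so that it runs without extra packages), and the names `line`, `headword` and `hyphenated` are illustrative.

```python
import re  # assumption: stdlib re instead of `import regex as re`

line = 'dusty a.多灰尘的'

# Step 1: the search pattern picks out the headword at the start of the line.
headword = re.search(r'^\w+', line, re.M).group()

# Step 2: the replace pattern matches the empty position between two word
# characters, so substituting '-' there separates the letters.
hyphenated = re.sub(r'(?<=\w)(?=\w)', '-', headword)

print(headword)    # dusty
print(hyphenated)  # d-u-s-t-y
```

Because the second pattern only ever sees the matched headword, the rest of the line is left untouched.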

Original issue reported on code.google.com by animaliz...@gmail.com on 21 May 2014 at 1:11

GoogleCodeExporter commented 9 years ago
I don't think that there's sufficient demand for it. Anyway, it can be shortened:

def search_and_replace(string,
                       search_regex, search_flags,
                       replace_regex, replace_flags,
                       replace_string):

    '''two steps: 1, search. 2, replace the first step's results.'''

    def replacement(match):
        return re.sub(replace_regex, replace_string, match.group(), flags=replace_flags)

    return re.sub(search_regex, replacement, string, flags=search_flags)
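
The shortened version relies on `sub` accepting a function as the replacement: the function is called once per match and its return value is inserted in place of that match. A minimal self-contained sketch of calling it follows; it assumes the standard-library `re` module rather than `regex`, and the two-line `sample` string is made up for illustration.

```python
import re  # assumption: stdlib re; the thread itself uses `import regex as re`

def search_and_replace(string,
                       search_regex, search_flags,
                       replace_regex, replace_flags,
                       replace_string):
    '''two steps: 1, search. 2, replace the first step's results.'''

    def replacement(match):
        # Called for every match of search_regex; only the matched text
        # is passed through the second substitution.
        return re.sub(replace_regex, replace_string, match.group(),
                      flags=replace_flags)

    return re.sub(search_regex, replacement, string, flags=search_flags)

sample = 'dying a.\ndye vt.&vi.'  # hypothetical input
result = search_and_replace(sample,
                            r'^\w+', re.M,
                            r'(?<=\w)(?=\w)', 0,
                            r'-')
print(result)
# d-y-i-n-g a.
# d-y-e vt.&vi.
```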

Original comment by re...@mrabarnett.plus.com on 21 May 2014 at 2:12