bruderstein / PythonScript

Python Script plugin for Notepad++
http://npppythonscript.sourceforge.net/
GNU General Public License v2.0

rereplace side effect of performance tweak #352

Open chuckbecker opened 5 days ago

chuckbecker commented 5 days ago

In the docs for rereplace, you say:

An small point to note, is that the replacements are first searched, and then all replacements are made. This is done for performance and reliability reasons. Generally this will have no side effects, however there may be cases where it makes a difference. (Author’s note: If you have such a case, please post a note on the forums such that it can be added to the documentation, or corrected).

Well, I found a situation: When the callback function changes the text in addition to returning the replacement, that can change the position of future matches.

My use case is a JSON file whose structure includes a bunch of "id" values. The ids are hierarchical, so "6.1" should live inside "6". Currently the ids in the file are all out of order. I want to go through the file and renumber each integer id (e.g., "6") according to the order in which it appears, and also rename all of its child ids (e.g., "6.1") so that they stay together with their parent.

So I want to go from something like this:

    {
      "type": "html",
      "id": 15,
      "formId": 73,
    },
    {
      "type": "text",
      "id": 6,
      "formId": 73,
      "inputs": [
        {
          "id": "6.2",
          "label": "Prefix",
          "name": "",
        },
        {
          "id": "6.3",
          "label": "First",
          "name": "",
        }
      ]
    },
    {
      "type": "text",
      "id": 14,
      "formId": 73
    }

to this:

    {
      "type": "html",
      "id": 1,
      "formId": 73,
    },
    {
      "type": "text",
      "id": 2,
      "formId": 73,
      "inputs": [
        {
          "id": "2.2",
          "label": "Prefix",
          "name": "",
        },
        {
          "id": "2.3",
          "label": "First",
          "name": "",
        }
      ]
    },
    {
      "type": "text",
      "id": 3,
      "formId": 73
    }

Here's the python code I'm using:

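    i = 1  # next top-level id to assign

    def change(match):
        global i
        # Side effect: rename child ids (e.g. "6.2") while the outer replace is still running.
        editor.rereplace(r'"id": "' + match.group(2) + r'\.(\d+)"', r'"id": "' + str(i) + '.' + r'\1"')
        i = i + 1
        return match.group(1) + str(i - 1) + ','

    editor.rereplace(r'("id": )(\d+),', change)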

Notice that the callback function, change(), also performs another replace before it returns the replacement value, although that replace should only affect the file after the current match. Since the current implementation apparently collects the positions of all the matches before applying any replacements (I'm assuming that to be the case?), the result looks something like this:

    {
      "type": "html",
      "id": 1,
      "formId": 73,
    },
    {
      "type": "text",
      "id": 2,
      "formId": 73,
      "inputs": [
        {
          "id": "2.2",
          "label": "Prefix",
          "name": "",
        },
        {
          "id": "2.3",
          "label": "First",
          "name": "",
        }
      ]
    },
    {
      "type": "text",
      "id": 14,
    3,"formId": 73
    }

Notice that in the last object, the "14" has not been replaced, and the "3" gets inserted in the wrong place.
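To illustrate why editing the document from inside the replace function can misplace later replacements, here is a minimal, self-contained sketch in plain Python. It is not PythonScript's actual implementation: `replace_with_precomputed_spans`, `make_callback`, and the one-line sample document are all hypothetical, and the model simply assumes that every match position is recorded up front and later adjusted only for the length changes the algorithm itself makes.

```python
import itertools
import re


def replace_with_precomputed_spans(text, pattern, callback):
    """Hypothetical model: locate every match first, then apply replacements."""
    buffer = [text]  # one-element list so the callback can mutate the "document"
    matches = list(re.finditer(pattern, text))  # positions are fixed here
    shift = 0  # accounts only for length changes made by this loop
    for m in matches:
        replacement = callback(m, buffer)
        start, end = m.start() + shift, m.end() + shift
        buffer[0] = buffer[0][:start] + replacement + buffer[0][end:]
        shift += len(replacement) - (m.end() - m.start())
    return buffer[0]


def make_callback(mutate_document):
    counter = itertools.count(1)

    def callback(match, buffer):
        new_id = str(next(counter))
        if mutate_document:
            # Side effect: rename child ids such as "15.1" to "1.1".
            old_prefix = '"' + match.group(1) + '.'
            buffer[0] = buffer[0].replace(old_prefix, '"' + new_id + '.')
        return '"id": ' + new_id + ','

    return callback


doc = '"id": 15, "id": "15.1", "id": 14,'
pattern = r'"id": (\d+),'

clean_result = replace_with_precomputed_spans(doc, pattern, make_callback(False))
stale_result = replace_with_precomputed_spans(doc, pattern, make_callback(True))

print(clean_result)
print(stale_result)
```

With `mutate_document=False` the recorded positions stay valid and both top-level ids are renumbered in place. With `mutate_document=True` the child rename shortens the text by one character that `shift` knows nothing about, so the final replacement is written at a stale offset and the text around the last id is corrupted.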

I get that maybe I'm using the callback function inappropriately, but since you asked for examples, I figured I'd pass this along.

Perhaps there could be an optional parameter that turns off the performance optimization in rereplace?

alankilborn commented 5 days ago

I'm using the "callback" function inappropriately

(Note: quotes added because callback has a different meaning in PythonScript.)

I strongly agree with this statement. Perhaps the documentation should be changed to say specifically not to do this or, more generally, that the replace function should not change the text of the document; it should only supply the text that the current match will be replaced with.
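One way to follow that advice for the renumbering use case is to split the work into a read-only pass and a replace pass whose functions only return text. The sketch below uses plain Python's `re` module on a string so it can run outside Notepad++. `renumber_ids`, `PARENT`, `CHILD`, and the shortened sample are hypothetical names and data, and the sketch assumes that top-level ids are unique and always followed by a comma, as in the example above.

```python
import re

PARENT = re.compile(r'("id": )(\d+),')          # top-level id, e.g. "id": 6,
CHILD = re.compile(r'("id": ")(\d+)(\.\d+")')   # child id, e.g. "id": "6.2"


def renumber_ids(text):
    # Pass 1 (read only): map each old top-level id to its order of appearance.
    mapping = {}
    for m in PARENT.finditer(text):
        mapping.setdefault(m.group(2), str(len(mapping) + 1))

    # Pass 2: replace functions that only supply replacement text.
    def parent_repl(m):
        return m.group(1) + mapping[m.group(2)] + ','

    def child_repl(m):
        # Child ids whose parent id is unknown are left unchanged.
        return m.group(1) + mapping.get(m.group(2), m.group(2)) + m.group(3)

    text = PARENT.sub(parent_repl, text)
    return CHILD.sub(child_repl, text)


sample = '''{
  "id": 15,
  "formId": 73,
},
{
  "id": 6,
  "formId": 73,
  "inputs": [
    { "id": "6.2" },
    { "id": "6.3" }
  ]
},
{
  "id": 14,
  "formId": 73
}'''

result = renumber_ids(sample)
print(result)
```

Because the old-to-new mapping is built before any text changes, child ids can be renamed regardless of where they appear relative to their parent. Inside Notepad++ the same function could presumably be applied with `editor.setText(renumber_ids(editor.getText()))`, though that wiring is untested here.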