interconnectit / Search-Replace-DB

This script was made to aid the process of migrating PHP and MySQL based websites. Works with most common CMSes.
https://interconnectit.com/products/search-and-replace-for-wordpress-databases/
GNU General Public License v3.0
4k stars, 855 forks

Regex with backreferences not doing any replacing #247

Closed: dleigh closed this issue 4 years ago

dleigh commented 6 years ago

First of all let me express my gratitude for an amazing tool that simply outshines all the competing tools! Thank you!

Secondly, I'd love to see some specific info about the regex implementation, given the huge number of dialects out there and the extra flags. I'm guessing it's PHP's PCRE, but I've seen references to JavaScript as well. Anyway, just knowing a bit more about what it's trying to do under the covers would help in creating the regex string.

My issue: I have some posts in WordPress that have a consistently formatted tag area at the end of each post that I want to get rid of. I have the beginning paragraph with a unique class to start with and the end paragraph tag to end with and it's at the end of the post. I "find" the whole post and then use backreferences to try and replace it with only the beginning of the post, discarding the tags section.

It seems to "find" the posts correctly, but when I preview the changes in a dry run, nothing has changed. My posts are long, but let me mock up a post and show you what I'm looking for and what I'm getting:

Initial post:

<p>Here is my blog post content</p> <p class="zoundry_bw_tags"> <!-- Tag links generated by Zoundry Blog Writer. Do not manually edit. https://www.zoundry.com --> <span class="ztags"><span class="ztagspace">Technorati</span> : <a href="https://technorati.com/tag/branch" class="ztag" rel="tag">branch</a> </span> <br /> <span class="ztags"><span class="ztagspace">Del.icio.us</span> : <a href="https://del.icio.us/tag/branch" class="ztag" rel="tag">branch</a></span> <br /> <span class="ztags"><span class="ztagspace">Ice Rocket</span> : <a href="https://www.icerocket.com/search?tab=blog&amp;fr=h&amp;q=branch" class="ztag" rel="tag">branch</a></span> </p>

What I want to see after replacing is:

<p>Here is my blog post content</p>

Instead, what I'm seeing is just the whole post again - no changes shown in the preview.

My regex string is:

(.*?)<p class="zoundry_bw_tags">(.*?)<\/p>(.*?)

My backreference string is:

$1

You can see that it works here:

https://regex101.com/r/ZnWRW5/1

If I'm doing something wrong, please let me know. Otherwise, is this a bug? Thanks!
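For readers following along, here is a minimal sketch of how the pattern above behaves on a one-line post versus a post with line breaks inside the tag paragraph. It uses Python's `re` module purely as a stand-in for the tool's regex engine (an assumption; Python writes the backreference as `\1` rather than `$1`), and the mock posts and the `strip_tags_block` helper are hypothetical. If the real posts contain newlines and `.` does not match them by default, that would be one plausible reason for a preview with no changes.

```python
import re

# Pattern from the question; the replacement $1 is spelled \1 in Python.
PATTERN = r'(.*?)<p class="zoundry_bw_tags">(.*?)<\/p>(.*?)'


def strip_tags_block(post: str, flags: int = 0) -> str:
    """Replace each match with capture group 1 (the text before the tag paragraph)."""
    return re.sub(PATTERN, r"\1", post, flags=flags)


# Hypothetical mock posts: the same content on one line and spread over several lines.
one_line = (
    '<p>Here is my blog post content</p> '
    '<p class="zoundry_bw_tags"> <span class="ztags">branch</span> </p>'
)
multi_line = (
    '<p>Here is my blog post content</p>\n'
    '<p class="zoundry_bw_tags">\n'
    '<span class="ztags">branch</span>\n'
    '</p>'
)

# One-line post: the tag paragraph is dropped, as on regex101.
print(repr(strip_tags_block(one_line)))

# Multi-line post without a dot-matches-newline flag: no match, so nothing changes.
print(repr(strip_tags_block(multi_line)))

# Same post with DOTALL (the counterpart of the PCRE "s" modifier): the tag paragraph is dropped.
print(repr(strip_tags_block(multi_line, re.DOTALL)))
```

In this sketch only the flag differs between the last two calls, which suggests checking whether the stored posts contain line breaks before reworking the pattern itself.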

gianluigi-icit commented 4 years ago

hello!

I double-checked a few things. The problem is how the regex is crafted; here is a working example:

php srdb.cli.php -u dbuser -p dbpass -n banana -s '/.*<p class="zoundry_bw_tags">(.*?)<\\\/p>.*/' -r '$1' -h standfirst-mariadb --regex true

Before

<p>Here is my blog post content</p> <p class="zoundry_bw_tags"> <!-- Tag links generated by Zoundry Blog Writer. Do not manually edit. https://www.zoundry.com --> <span class="ztags"><span class="ztagspace">Technorati</span> : <a href="https://technorati.com/tag/branch" class="ztag" rel="tag">branch</a> </span> <br /> <span class="ztags"><span class="ztagspace">Del.icio.us</span> : <a href="https://del.icio.us/tag/branch" class="ztag" rel="tag">branch</a></span> <br /> <span class="ztags"><span class="ztagspace">Ice Rocket</span> : <a href="https://www.icerocket.com/search?tab=blog&amp;fr=h&amp;q=branch" class="ztag" rel="tag">branch</a></span> </p>

After

 <!-- Tag links generated by Zoundry Blog Writer. Do not manually edit. https://www.zoundry.com --> <span class="ztags"><span class="ztagspace">Technorati</span> : <a href="https://technorati.com/tag/branch" class="ztag" rel="tag">branch</a> </span> <br /> <span class="ztags"><span class="ztagspace">Del.icio.us</span> : <a href="https://del.icio.us/tag/branch" class="ztag" rel="tag">branch</a></span> <br /> <span class="ztags"><span class="ztagspace">Ice Rocket</span> : <a href="https://www.icerocket.com/search?tab=blog&amp;fr=h&amp;q=branch" class="ztag" rel="tag">branch</a></span> 
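Note that with `$1` as the replacement, the result is whatever the single capture group holds, which here is the inside of the tag paragraph rather than the post body. The sketch below reproduces that with Python's `re` module as a stand-in (an assumption; the `/` delimiters and shell escaping are dropped, and `$1` becomes `\1`) and shows one possible variant that moves the group in front of the tag paragraph to keep the post body instead. The shortened mock post and the variable names are hypothetical.

```python
import re

# Hypothetical, shortened version of the mock post from the thread.
post = (
    '<p>Here is my blog post content</p> '
    '<p class="zoundry_bw_tags"> <span class="ztags">branch</span> </p>'
)

# Pattern from the working example: group 1 captures the inside of the tag paragraph.
keep_tags = re.sub(r'.*<p class="zoundry_bw_tags">(.*?)</p>.*', r"\1", post)

# Variant (a sketch, not the maintainer's command): capture the text before the
# tag paragraph and let the rest of the pattern consume everything after it.
keep_body = re.sub(r'(.*?)\s*<p class="zoundry_bw_tags">.*?</p>.*', r"\1", post)

print(repr(keep_tags))
print(repr(keep_body))
```

If the posts contain line breaks, either pattern would likely also need a dot-matches-newline modifier (`re.DOTALL` in Python, `s` in PCRE).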

dleigh commented 4 years ago

Thank you! Regex is a deep well and I rarely take a drink. I'm glad there are those who know it well!
