interconnectit / Search-Replace-DB

This script was made to aid the process of migrating PHP and MySQL based websites. Works with most common CMSes.
https://interconnectit.com/products/search-and-replace-for-wordpress-databases/
GNU General Public License v3.0

Escaped URLs are not found #365

Open GigiSan opened 3 years ago

GigiSan commented 3 years ago

Hello everyone,

Since v4 it looks like escaped URLs are not found properly.

Consider a row with this value: {"action":"console","days":90,"prev":{"country":"all","profile":"https:\/\/testdomain.com\/"}}

If I just search for testdomain.com and replace it with anotherdomain.com, this row gets found correctly.

But if I need to replace the full URL, for example to change the protocol and turn https:\/\/testdomain.com into http:\/\/anotherdomain.com, the search string is not found.

Some more details on the matter:

If you need any further info please let me know. Meanwhile, thanks in advance for your time and have a nice day! Regards, -Gigi

itajackass commented 3 years ago

same problem for me

amelhorn commented 3 years ago

We can replicate the issue as well: any URL containing a '/' or '\' is no longer found in version 4.

GigiSan commented 3 years ago

UPDATE: The web version seems to have some problems with the '/' delimiters it prepends and appends to the regex, so I changed those to '#' in the srdb code.


I have found a temporary workaround, maybe even a permanent one for me: using a regex, since regex search-replace doesn't seem to be affected by the problem.

I will give you the example I used, in case someone has a similar requirement:

Search: (http(s)?)((\:\/\/)|(\:\\\/\\\/)|((\:|\%3A)\%2F\%2F))(testdomain\.com)

(Check "use regex", then the Case insensitive and Multiline flags.)

Replace: http$3anotherdomain.com

This will replace:

https://testdomain.com
http://testdomain.com
https:\/\/testdomain.com
http:\/\/testdomain.com
https:%2F%2Ftestdomain.com
http:%2F%2Ftestdomain.com
https%3A%2F%2Ftestdomain.com
http%3A%2F%2Ftestdomain.com

respectively with:

http://anotherdomain.com
http://anotherdomain.com
http:\/\/anotherdomain.com
http:\/\/anotherdomain.com
http:%2F%2Fanotherdomain.com
http:%2F%2Fanotherdomain.com
http%3A%2F%2Fanotherdomain.com
http%3A%2F%2Fanotherdomain.com

This makes WordPress sites imported locally work flawlessly, no matter how many plugins there are. 🙂 You can find an explanation of the regex HERE. Hope this helps someone while a solution is found.
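For anyone who wants to sanity-check the idea outside srdb, here is a minimal sketch using sed. It assumes a sed that supports -E, and `rewrite_urls` is a made-up helper name, not part of Search-Replace-DB. The pattern is a simplified equivalent of the one above: the separator (plain, JSON-escaped, or URL-encoded) is captured once and reused via \1, the same way $3 is reused in the srdb replacement. Unlike the srdb run with the Case insensitive flag, this sketch is case sensitive.

```shell
#!/bin/sh
# Hypothetical helper (not part of Search-Replace-DB): reads text on stdin
# and rewrites every testdomain.com URL variant to http + anotherdomain.com.
rewrite_urls() {
    # '#' is the sed delimiter, so '/' needs no escaping in the pattern.
    # Group 1 is the separator: '://', ':\/\/', ':%2F%2F' or '%3A%2F%2F'.
    sed -E 's#https?(://|:\\/\\/|(:|%3A)%2F%2F)testdomain\.com#http\1anotherdomain.com#g'
}

# Quoted heredoc delimiter keeps the backslashes literal.
rewrite_urls <<'EOF'
https://testdomain.com
http://testdomain.com
https:\/\/testdomain.com
http:\/\/testdomain.com
https:%2F%2Ftestdomain.com
http:%2F%2Ftestdomain.com
https%3A%2F%2Ftestdomain.com
http%3A%2F%2Ftestdomain.com
EOF
```

Each input line should come back with the protocol set to http and the domain swapped, while the separator keeps whatever encoding it had.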

mkjmdski commented 2 years ago

@GigiSan's solution works great! The tricky part is passing a regex that complex from a bash script. I've written a simple function that runs search-and-replace from the command line.

The first argument is the old domain we want to replace (without the protocol), the second is the new one, and the third is the protocol (http or https) we want to start using.

run_replace() {
    existing_domain="$1"
    new_domain="$2"
    new_protocol="$3"
    srdb_root_path='/usr/local/lib/search-and-replace'
    # PHP 5.6 and 7.1 get the 3.1 release; every other version gets 4.1.
    if printf '%s' "$PHP_VERSION" | grep -Eq '5\.6|7\.1'; then
        srdb_path="${srdb_root_path}/3.1"
    else
        srdb_path="${srdb_root_path}/4.1"
    fi
    # Escape the dots so they match literally inside the regex.
    escaped_domain=$(printf '%s' "$existing_domain" | sed 's/\./\\./g')
    # regex from: https://github.com/interconnectit/Search-Replace-DB/issues/365
    # Single quotes keep every backslash literal, so the pattern does not
    # depend on how the shell's echo treats backslash escapes.
    search='#(http(s)?)((\:\/\/)|(\:\\\/\\\/)|((\:|\%3A)\%2F\%2F))('"${escaped_domain}"')#'
    php "${srdb_path}/srdb.cli.php" \
        --host "$MYSQL_HOSTNAME" \
        --user "$MYSQL_USERNAME" \
        --pass "$MYSQL_PASS" \
        --port "$MYSQL_PORT" \
        --name "$MYSQL_DATABASE" \
        --search "$search" \
        --replace "${new_protocol}\$3${new_domain}" \
        --debug --verbose --regex
}
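
Before pointing this at a real database, it can help to print the exact --search and --replace strings the function would hand to srdb. The sketch below does only that. `build_search` and `build_replace` are hypothetical helper names introduced for illustration, and the '#' delimiters follow the convention used in this thread rather than anything confirmed by the srdb documentation.

```shell
#!/bin/sh
# Hypothetical dry-run helpers: they only print strings, nothing is executed
# against a database.

# $1 = old domain. Prints the regex with the domain's dots escaped.
build_search() {
    escaped_domain=$(printf '%s' "$1" | sed 's/\./\\./g')
    printf '%s\n' '#(http(s)?)((\:\/\/)|(\:\\\/\\\/)|((\:|\%3A)\%2F\%2F))('"${escaped_domain}"')#'
}

# $1 = new domain, $2 = new protocol. The literal $3 is the back-reference
# to the separator group captured by the search regex.
build_replace() {
    printf '%s%s%s\n' "$2" '$3' "$1"
}

build_search 'testdomain.com'
build_replace 'anotherdomain.com' 'http'
```

If the printed pattern differs from the one in GigiSan's comment (apart from the '#' delimiters), the shell quoting is the first place to look.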