wp-cli / search-replace-command

Searches/replaces strings in the database.
MIT License
57 stars 45 forks source link

Replacing all occurrences requires multiple runs #164

Closed stefanfisk closed 2 years ago

stefanfisk commented 2 years ago

Bug Report

Describe the current, buggy behavior

When running search-replace commands on a quite large DB, not all replacements are made. Running the command in a loop until it outputs `Success: Made 0 replacements." does eventually replace all occurrences.

The string I'm replacing is the site hostname, so a large part of the rows are affected.

The command I'm running is essentially wp db search-replace --all-tables --report-changed-only www.customer.com customer.test. I've tried all variations of the command that I could come up with (including with and without --precise), but it makes no difference.

Here's the size of the DB:

wp_wp_quiz_play_data    7599
wp_users    33
wp_redirection_logs 5711
wp_redirection_404  4336
wp_posts    59567
wp_yoast_migrations 21
wp_options  2088
wp_yoast_indexable_hierarchy    6581
wp_simple_history_contexts  416608
wp_termmeta 368
wp_yoast_seo_meta   24576
wp_term_relationships   36248
wp_usermeta 1191
wp_yoast_indexable  33768
wp_yoast_primary_term   733
wp_redirection_groups   11
wp_simple_history   91696
wp_term_taxonomy    593
wp_yoast_seo_links  10145
wp_terms    593
wp_comments 427
wp_redirection_items    2967
wp_postmeta 1857370
wp_links    0
wp_commentmeta  0

On the 1st run I get:

+----------------------------+-----------------------+--------------+------+
| Table                      | Column                | Replacements | Type |
+----------------------------+-----------------------+--------------+------+
| wp_comments                | comment_author_url    | 251          | SQL  |
| wp_comments                | comment_content       | 3            | SQL  |
| wp_options                 | option_value          | 7            | PHP  |
| wp_postmeta                | meta_value            | 18590        | PHP  |
| wp_posts                   | post_content          | 4787         | PHP  |
| wp_posts                   | pinged                | 117          | SQL  |
| wp_posts                   | guid                  | 51191        | SQL  |
| wp_redirection_404         | url                   | 22           | SQL  |
| wp_redirection_404         | domain                | 4462         | SQL  |
| wp_redirection_404         | referrer              | 282          | SQL  |
| wp_redirection_items       | url                   | 2            | SQL  |
| wp_redirection_items       | match_url             | 2            | SQL  |
| wp_redirection_items       | action_data           | 327          | PHP  |
| wp_redirection_logs        | url                   | 1            | SQL  |
| wp_redirection_logs        | domain                | 5967         | SQL  |
| wp_redirection_logs        | sent_to               | 435          | SQL  |
| wp_redirection_logs        | referrer              | 473          | SQL  |
| wp_simple_history_contexts | value                 | 4370         | SQL  |
| wp_term_taxonomy           | description           | 4            | PHP  |
| wp_termmeta                | meta_value            | 2            | SQL  |
| wp_usermeta                | meta_value            | 8            | PHP  |
| wp_wp_quiz_play_data       | quiz_data             | 7999         | SQL  |
| wp_yoast_indexable         | permalink             | 33757        | SQL  |
| wp_yoast_indexable         | canonical             | 60           | SQL  |
| wp_yoast_indexable         | twitter_image         | 11595        | SQL  |
| wp_yoast_indexable         | open_graph_image      | 10761        | SQL  |
| wp_yoast_indexable         | open_graph_image_meta | 7308         | SQL  |
| wp_yoast_seo_links         | url                   | 5599         | SQL  |
+----------------------------+-----------------------+--------------+------+

2nd:

+-------------+--------------+--------------+------+
| Table       | Column       | Replacements | Type |
+-------------+--------------+--------------+------+
| wp_postmeta | meta_value   | 9000         | PHP  |
| wp_posts    | post_content | 2000         | PHP  |
+-------------+--------------+--------------+------+

3rd:

+-------------+--------------+--------------+------+
| Table       | Column       | Replacements | Type |
+-------------+--------------+--------------+------+
| wp_postmeta | meta_value   | 5000         | PHP  |
| wp_posts    | post_content | 1000         | PHP  |
+-------------+--------------+--------------+------+

4th:

+-------------+--------------+--------------+------+
| Table       | Column       | Replacements | Type |
+-------------+--------------+--------------+------+
| wp_postmeta | meta_value   | 2000         | PHP  |
| wp_posts    | post_content | 1000         | PHP  |
+-------------+--------------+--------------+------+

5th, 6th:

+-------------+------------+--------------+------+
| Table       | Column     | Replacements | Type |
+-------------+------------+--------------+------+
| wp_postmeta | meta_value | 1000         | PHP  |
+-------------+------------+--------------+------+

And the 7th just Success: Made 0 replacements..

Describe how other contributors can replicate this bug

Right now I don't have any other way of reproducing the issue except using my customer's DB dump. If needed, I'll get a repo together where it can be reproduced. Is it possible to create large DBs inside a test feature in the repo?

Describe what you would expect as the correct outcome

All replacements should be done in a single run.

Let us know what environment you are running this on

OS: Linux 5.14.8-arch1-1 #1 SMP PREEMPT Sun, 26 Sep 2021 19:36:15 +0000 x86_64
Shell:  
PHP binary: /usr/local/bin/php
PHP version:    7.4.16
php.ini used:   /usr/local/etc/php/php.ini
MySQL binary:   /usr/bin/mysql
MySQL version:  mysql  Ver 15.1 Distrib 10.3.27-MariaDB, for debian-linux-gnu (x86_64) using readline 5.2
SQL modes:  
WP-CLI root dir:    /srv/app/vendor/wp-cli/wp-cli
WP-CLI vendor dir:  /srv/app/vendor
WP_CLI phar path:   
WP-CLI packages dir:    
WP-CLI global config:   
WP-CLI project config:  /srv/app/wp-cli.yml
WP-CLI version: 2.5.0
remyvv commented 2 years ago

Related: https://github.com/wp-cli/wp-cli/issues/5573

cjhaas commented 2 years ago

We are running into this too, lately. I have two servers, one is Ubuntu 20.04, the other is 21.04, and when I run a standard search-replace across the exact same database, the older server completes in one pass but the newer one takes 7 or so passes.

The older server's info is:

OS:     Linux 5.4.0-89-generic #100-Ubuntu SMP Fri Sep 24 14:50:10 UTC 2021 x86_64
Shell:  /bin/bash
PHP binary:     /usr/bin/php7.4
PHP version:    7.4.25
php.ini used:   /etc/php/7.4/cli/php.ini
MySQL binary:   /usr/bin/mysql
MySQL version:  mysql  Ver 15.1 Distrib 10.4.22-MariaDB, for debian-linux-gnu (x86_64) using readline 5.2
SQL modes:
WP-CLI root dir:        phar://wp-cli.phar/vendor/wp-cli/wp-cli
WP-CLI vendor dir:      phar://wp-cli.phar/vendor
WP_CLI phar path:       /var/www/
WP-CLI packages dir:
WP-CLI global config:
WP-CLI project config:
WP-CLI version: 2.5.0-alpha-9061b3e

The newer server's info is:

OS:     Linux 5.4.0-90-generic #101-Ubuntu SMP Fri Oct 15 20:00:55 UTC 2021 x86_64
Shell:  /bin/bash
PHP binary:     /usr/bin/php7.4
PHP version:    7.4.25
php.ini used:   /etc/php/7.4/cli/php.ini
MySQL binary:   /usr/bin/mysql
MySQL version:  mysql  Ver 15.1 Distrib 10.6.5-MariaDB, for debian-linux-gnu (x86_64) using readline 5.2
SQL modes:
WP-CLI root dir:        phar://wp-cli.phar/vendor/wp-cli/wp-cli
WP-CLI vendor dir:      phar://wp-cli.phar/vendor
WP_CLI phar path:       /var/www/
WP-CLI packages dir:
WP-CLI global config:
WP-CLI project config:
WP-CLI version: 2.5.0

I'll try to re-run this later today or tomorrow to get exact numbers, but towards the end it finds exactly 6,000, then exactly 3,000, and then I think it is 1,000 and then 0, but I'll update when I know more.

wojsmol commented 2 years ago

@cjhaas Please use WP-CLI 2.5.0 stable on older server and test again.

cjhaas commented 2 years ago

@wojsmol , unfortunately do to time constraints I was only able to do the opposite, which is run alpha on the new server, and that worked as expected, all replacements ran in the first pass.

If I get a chance this weekend, I'll see if I can debug this further

wojsmol commented 2 years ago

@cjhaas Did you run the same phar file or fresh nightly? Additio9nally please check innodb_default_row_format on new server.

cjhaas commented 2 years ago

@wojsmol , I copied the alpha phar from my older server, so it is identical. Both servers have a value of dynamic for that variable.

For MySQL, they are both stock unmodified installs using:

curl -sS https://downloads.mariadb.com/MariaDB/mariadb_repo_setup | sudo bash
apt install mariadb-server-10.6

(replacing the version appropriately)

clockworkroger commented 2 years ago

You might be able to potentially rule out the InnoDB settings — I just hit this issue earlier today when trying to update a database, so I figured I'd try using the --precise option (which I believe is supposed to do everything with PHP instead of sometimes with SQL queries, right?), but it still took multiple attempts. Same thing too, some tables were stopping at multiples of 1000, but not in an order that makes sense. Had a postmeta table do 3000, then 1814, then 1000.

I was running it locally — WP-CLI 2.5.0, MariaDB 10.6.4, and PHP 8.0.12 installed with Homebrew on latest macOS.

cjhaas commented 2 years ago

Okay, this appears to be fixed in #162 actually

However, the replacement is happening within each chunk, which breaks the $offset handling, causing every second chunk to be missed. This makes it necessary to run a query multiple times to ensure all occurrences are replaced.

This PR defers the actual updates until after the full search. This is still worse in terms of memory management that completely isolated chunking, but it should still be an overall improvement while making the replacements in a reliable way.

wojsmol commented 2 years ago

@cjhaas You can test if #162 fixes the issue by running your test with latest nightly available here or by running wp cli update --nightly

bradp-wordkeeper commented 2 years ago

Just thought I would reply since I have a site with the same issue. I just tested nightly (2.5.1-alpha-76fabae) and it fixed the problem for me. :)

schlessera commented 2 years ago

Fixed with v2.6.0 release.