Yoast / wordpress-seo

Yoast SEO for WordPress
https://yoast.com/wordpress/plugins/seo/
Other
1.75k stars 885 forks source link

Update to 17.9 (at the latest) breaks DB layout (Row size too large) #17988

Open tgoeg opened 2 years ago

tgoeg commented 2 years ago

Please give us a description of what happened.

Updated to 17.9, got the following in the web server's error log (then unnoticed):

[Wed Jan 19 14:46:10.144352 2022] [php7:notice] [pid 623965] [client
1.1.1.1:32807] WordPress database error Row size too large. The maximum row
size for the used table type, not counting BLOBs, is 8126. This includes
storage overhead, check the manual. You have to change some columns to TEXT or
BLOBs for query ALTER TABLE `wpx_yoast_indexable` ADD `object_last_modified`
datetime von require('wp-blog-header.php'), require_once('wp-load.php'),
require_once('wp-config.php'), require_once('wp-settings.php'),
include_once('/plugins/wordpress-seo/wp-seo.php'),
require_once('/plugins/wordpress-seo/wp-seo-main.php'), YoastSEO,
Yoast\\WP\\Lib\\Abstract_Main->load, Yoast\\WP\\SEO\\Loader->load,
Yoast\\WP\\SEO\\Loader->load_initializers,
Yoast\\WP\\SEO\\Initializers\\Migration_Runner->initialize,
Yoast\\WP\\SEO\\Initializers\\Migration_Runner->run_free_migrations,
Yoast\\WP\\SEO\\Initializers\\Migration_Runner->run_migrations,
Yoast\\WP\\SEO\\Initializers\\Migration_Runner->run_migration,
Yoast\\WP\\SEO\\Config\\Migrations\\AddObjectTimestamps->up,
Yoast\\WP\\Lib\\Migrations\\Migration->add_column,
Yoast\\WP\\Lib\\Migrations\\Adapter->add_column,
Yoast\\WP\\Lib\\Migrations\\Adapter->execute_ddl,
Yoast\\WP\\Lib\\Migrations\\Adapter->query

This is on a innodb_file_per_table setup on a current mariaDB/mySQL with row_format=dynamic. So basically maxing out what is technically possible as a max row size.

Since then, WordPress is extremely slow because of repeated log entries like this:

[Mon Jan 24 00:00:47.473268 2022] [php7:notice] [pid 318544] [client
1.2.3.4:17407] WordPress database error Unknown column 'object_published_at' in
'field list' for query UPDATE `wpx_yoast_indexable` SET `object_id` = '1',
`object_type` = 'term', `object_sub_type` = 'category', `permalink` =
'https://domain.com/category/news/', `blog_id` = '1',
`primary_focus_keyword_score` = '18', `is_robots_noindex` = NULL, `is_public` =
NULL, `open_graph_image` = NULL, `open_graph_image_id` = NULL,
`open_graph_image_source` = NULL, `open_graph_image_meta` = NULL,
`twitter_image` = NULL, `twitter_image_id` = NULL, `twitter_image_source` =
NULL, `canonical` = NULL, `primary_focus_keyword` = 'Global', `title` = NULL,
`description` = NULL, `readability_score` = '0', `breadcrumb_title` = 'Global',
`open_graph_title` = NULL, `open_graph_description` = NULL, `twitter_title` =
NULL, `twitter_description` = NULL, `is_cornerstone` = '0',
`is_robots_nofollow` = NULL, `is_robots_noarchive` = NULL,
`is_robots_noimageindex` = NULL, `is_robots_nosnippet` = NULL,
`object_published_at` = '2016-03-11 06:04:21', `object_last_modified` =
'2022-01-18 14:46:43', `version` = '2', `has_ancestors` = '', `permalink_hash`
= '36:59f30d20708642aef27eddb9962f8d0c', `updated_at` = '2022-01-23 23:00:47'
WHERE `id` = '40' in require('wp-blog-header.php'),
require_once('wp-includes/template-loader.php'), [..]

Tried to use wp yoast index to no avail (expected, another column is missing):

for query UPDATE `wpx_yoast_indexable` SET `object_id` = '8', `object_type` =
'user', `permalink` = 'https://domain.com/author/username/', `title` = NULL,
`description` = NULL, `is_cornerstone` = '0', `is_robots_noindex` = '0',
`is_robots_nofollow` = NULL, `is_robots_noarchive` = NULL,
`is_robots_noimageindex` = NULL, `is_robots_nosnippet` = NULL, `is_public` =
NULL, `has_public_posts` = NULL, `blog_id` = '1', `open_graph_image` =
'https://0.gravatar.com/avatar/91082333ec444a5a01574f4085504dcd?s=500&d=mm&r=g',
`open_graph_image_id` = NULL, `open_graph_image_source` = 'gravatar-image',
`open_graph_image_meta` = NULL, `twitter_image` =
'https://0.gravatar.com/avatar/91082333ec444a5a01574f4085504dcd?s=500&d=mm&r=g',
`twitter_image_id` = NULL, `twitter_image_source` = 'gravatar-image',
`object_published_at` = '2019-08-25 07:00:54', `object_last_modified` =
'2022-01-24 09:54:23', `version` = '2', `permalink_hash` =
'36:14909ffdf0450e722557cfc7fa0f8a3b', `updated_at` = '2022-01-24 12:25:13'
WHERE `id` = '7' in include('phar://wp/php/boot-phar.php'),
include('phar://wp/vendor/wp-cli/wp-cli/php/wp-cli.php'), WP_CLI\bootstrap,
WP_CLI\Bootstrap\LaunchRunner->process, WP_CLI\Runner->start,
WP_CLI\Runner->run_command_and_exit, WP_CLI\Runner->run_command,
WP_CLI\Dispatcher\Subcommand->invoke, call_user_func,
WP_CLI\Dispatcher\CommandFactory::WP_CLI\Dispatcher\{closure}, call_user_func,
Yoast\WP\SEO\Commands\Index_Command->index,
Yoast\WP\SEO\Commands\Index_Command->run_indexation_actions,
Yoast\WP\SEO\Commands\Index_Command->run_indexation_action,
Yoast\WP\SEO\Actions\Indexing\Indexable_Post_Indexation_Action->index,
Yoast\WP\SEO\Repositories\Indexable_Repository->find_by_id_and_type,
Yoast\WP\SEO\Repositories\Indexable_Repository->upgrade_indexable,
Yoast\WP\SEO\Builders\Indexable_Builder->build,
Yoast\WP\SEO\Repositories\Indexable_Repository->find_by_id_and_type,
Yoast\WP\SEO\Repositories\Indexable_Repository->upgrade_indexable,
Yoast\WP\SEO\Builders\Indexable_Builder->build,
Yoast\WP\SEO\Builders\Indexable_Builder->save_indexable,
Yoast\WP\SEO\Models\Indexable->save, Yoast\WP\Lib\Model->save,
Yoast\WP\Lib\ORM->save, Yoast\WP\Lib\ORM::execute

Tried to use yoast-test-helper's Reset Indexables tables & migrations functionality to no avail:

[Mon Jan 24 15:46:57.983881 2022] [php7:notice] [pid 399179] [client
1.3.1.2:38840] WordPress database error Table
'mysite.wpx_yoast_prominent_words' doesn't exist for query TRUNCATE TABLE
wpx_yoast_prominent_words made by
do_action('admin_post_wordpress-seo-feature-reset'), WP_Hook->do_action,
WP_Hook->apply_filters,
Yoast\\WP\\Test_Helper\\WordPress_Plugin_Features->handle_reset_feature,
Yoast\\WP\\Test_Helper\\WordPress_Plugin_Features->reset_feature,
Yoast\\WP\\Test_Helper\\WordPress_Plugins\\Yoast_SEO->reset_feature,
Yoast\\WP\\Test_Helper\\WordPress_Plugins\\Yoast_SEO->reset_indexables,
Yoast\\WP\\Test_Helper\\WordPress_Plugins\\Yoast_SEO->reset_prominent_words_calculation,
referer: https://domain.com/wp-admin/tools.php?page=yoast-test-helper

This is the currently longest row in the affected table:

id,
permalink,
permalink_hash, object_id, object_type, object_sub_type, author_id, post_parent, title,
description,
breadcrumb_title, post_status,
is_public, is_protected, has_public_posts, number_of_pages, canonical, primary_focus_keyword, primary_focus_keyword_score, readability_score, is_cornerstone, is_robots_noindex, is_robots_nofollow, is_robots_noarchive, is_robots_noimageindex,
is_robots_nosnippet, twitter_title, twitter_image,twitter_description,
twitter_image_id, twitter_image_source, open_graph_title, open_graph_description,
open_graph_image, open_graph_image_id,
open_graph_image_source,
open_graph_image_meta,
link_count, incoming_link_count, prominent_words_version, created_at, updated_at, blog_id, language, region,
schema_page_type, schema_article_type, has_ancestors, estimated_reading_time_minutes, version

'2653',
'https://domain.com/some-redacted-long-stringsomeredactedlongstringsomeredactedlong/',
'81:cf47c97bafeac521749566e5c4bff289', '7423', 'post', 'post', '8', '0', NULL,
'someredactedlongstringsomeredactedlongstringsomeredactedlongstr
someredactedlongstringsomeredactedlongstringsomeredactedlongstr
someredactedlongstringsomeredactedlongstringsomeredactedlongstr
someredactedlongstringsomeredactedlongstringsomeredactedlongstr
someredactedlongstringsomeredactedlongstringsomeredactedlongstr
someredactedlongstrings',
'someredactedlongstringsomeredactedlongstringsomeredactedlongstr', 'publish',
NULL, '0', NULL, NULL, NULL, NULL, NULL, '30', '0', NULL, '0', NULL, NULL,
NULL, NULL, 'https://domain.com/wp-content/uploads/2020/09/img_1.jpg', NULL,
'7426', 'featured-image', NULL, NULL,
'https://domain.com/wp-content/uploads/2020/09/img_1.jpg', '7426',
'featured-image',
'{\"width\":620,\"height\":311,\"url\":\"https://domain.com/wp-content/uploads/2020/09/img_1.jpg\",\"path\":\"/som/edir/domain.com/wp-content/uploads/2020/09/img_1.jpg\",\"size\":\"full\",\"id\":7426,\"alt\":\"someredactedlongstringsomeredactedlongstringsomeredactedlongstr
someredactedlongstringsomeredactedlongstringsomeredactedlongstr
someredactedlongstringsomeredactedlongstringsomeredactedlongstr
someredactedlongstringsomeredactedlongstringsomeredactedlongstr
someredactedlongstringsomeredactedlongstringsomeredactedlongstr
someredactedlongstringssome-redactedlongstring
someredactedlongstring-redactedlongstring
someredactedlongstring-redactedlongstring
someredactedlongstring-redactedlongstring
someredactedlongstring-redactedlongstring
someredactedlongstring-redactedlongstring
someredactedlongstring-redactedlongstring
someredactedlongstring-redactedlongstring
someredactedlongstringstringstri\",\"pixels\":192820,\"type\":\"image/jpeg\"}',
'0', '5', NULL, '2020-09-18 09:48:55', '2021-12-10 09:04:36', '1', NULL, NULL,
NULL, NULL, '0', NULL, '1'

See how the entry violates 1NF and 3NF of DB normalization. https://domain.com/wp-content/uploads/2020/09/img_1.jpg alone is redundantly contained 3(!) times (and a fourth time with and absolute local path).

I am no full-time DBA and just a sysadmin, but I would have expected this opengraph and twitter data to be in a separate table (only once), and only referenced here. I don't know the intricacies of a WP DB, but also the object_id, object_type, object_sub_type columns would be a perfect fit for a separate table. Also is_public, is_protected, has_public_posts seems like something that could be joined in from WP core's tables? Structures like these are the most dominant cause for "row size too large" (and bad performance) when researching for the named error. This may very well be related to other tickets I saw in the issue queue describing high DB load. And to #17843, #17688. And to some others mentioning missing columns.

However, in this one I really drilled down to the root cause it seems.

Please describe what you expected to happen and why.

  1. Updates need to correctly handle errors in DB operations and roll back so no-one can get stuck in a situation like this. And it would prevent all those subsequent errors.
  2. DB layout has to be normalized to reduce redundant data and guarantee data integrity. This is screaming for subsequent errors all over the place.
  3. (Just as an aside: Enforce row_format=dynamic in the CREATE statements, on setups with row_format=compressed, this limit will be hit even earlier.)
  4. Some kind of way has to be found to restructure the current data with the missing columns and trim down the table to an acceptable size, so that
  5. Page is responsive again :-) And works without spilling the error log.

How can we reproduce this behavior?

I guess

  1. Install a version of Yoast SEO before the additional columns have been added (sorry I don't have the version that was installed before)
  2. Fill those tables with long example data.
  3. Try to update Yoast SEO and trigger its ALTER TABLE stanzas that add the columns.
  4. Run into the row size limit.

Technical info

Editors/Browser versions not relevant.

Used versions

Djennez commented 2 years ago

@tgoeg thanks for the report!

@herregroen @diedexx I haven't fully read every line in this report yet, but I think this warrants your attention.

tgoeg commented 2 years ago

Any update on this?

todeveni commented 2 years ago

Any update on this?

todeveni commented 1 year ago

Anniversary is coming up soon 🎂 Any update on this?