Yoast / wordpress-seo

Yoast SEO for WordPress
https://yoast.com/wordpress/plugins/seo/

Performance issues link_suggestions classic editor large site #15869

Closed stayallive closed 2 years ago

stayallive commented 4 years ago

Please give us a description of what happened.

When editing a post (using the classic editor) we can see that every pause in typing fires off a request to the link_suggestions API endpoint. On our site these requests take anywhere from 15 seconds to 3 minutes and up. This means that many requests pile up while typing, which slows each of them down even further (because of server constraints) to the point where they start to time out.

image

We are hosted on https://servebolt.com/ and have been told the resources for the database or PHP are not the bottleneck.

Please describe what you expected to happen and why.

I expected 1 maybe 2 link_suggestions API requests per editor instance at most and the performance of those endpoints to be quite good.

How can we reproduce this behavior?

  1. Open WordPress classic editor
  2. Type some word like foo
  3. Wait a bit for the link_suggestions API request
  4. Type some more words like foo
  5. Repeat until many link_suggestions API requests are observed

Technical info

Used versions


If the link_suggestions requests are indeed not supposed to be super fast the only issue here might be that there are many of them active at the same time. If it should be fast I'd like some pointers on how to profile/debug to find the root cause of that since I have no clue where to start digging.

jphorn commented 3 years ago

Yoast SEO 16.8 was released today. Is this the ONE? @Djennez

Any preliminary reports from others?

16.8

Release Date: July 27th, 2021

Yoast SEO 16.8 is out today! This release comes with an updated readability analysis with support for two new languages: Norwegian and Slovak. Did you know that Yoast SEO is nearing language support for twenty languages? Read more about what’s new in Yoast SEO 16.8 in our release post!

Enhancements:

  - Completes the readability analysis for Slovak by adding the transition words, sentence beginnings and passive voice assessments.
  - Improves keyphrase recognition in Slovak by filtering out function words such as som, a, jedna, že.
  - Completes the readability analysis for Norwegian by adding the transition words, sentence beginnings and passive voice assessments.
  - Improves keyphrase recognition in Norwegian by expanding the list of function words that are filtered out.
  - Adds the first two steps of the Premium cornerstone workout.
  - Throws a notification in the plugins page to users who have an expired subscription.
  - Improves the performance of background requests (admin-ajax calls).

Bugfixes:

  - Fixes a bug where paginated static frontpages would fail to output a valid breadcrumb.
  - Fixes a bug where the image selectors in the search appearance and social settings did not have a screen reader text.

Djennez commented 3 years ago

Improves the performance of background requests (admin-ajax calls).

This is in regards to: https://github.com/Yoast/wordpress-seo/issues/16812

Next release (16.9) will contain a lot of performance related fixes.

jphorn commented 3 years ago

Thanks, but not the answer I was hoping for. I'm really disappointed this has been open for so long. We can't use many of Yoast's unique features because of very poor performance management. Really sad. I really hope there will be more focus on performance for large sites (an important client base of Yoast, I reckon) in newer releases instead of yet another language-related bug fix.

paulocoghi commented 3 years ago

@Djennez, I would cordially ask you to review my suggestion on PR 17187, here: https://github.com/Yoast/wordpress-seo/pull/17187#issuecomment-887563953

Djennez commented 3 years ago

Hi all, small update. Our developers have been working on improving the overall performance of the plugin by changing and optimizing database queries. Today's release (16.9) contains several of these fixes. Unfortunately, despite testing several changes, we were not yet able to find a way to significantly improve performance on the issue from this thread (without compromising functionality / stability). This will still be researched in the next 2 versions of the plugin, and we hope to improve performance by 17.1.

paulocoghi commented 3 years ago

We are glad to know the Yoast team is aware of the issue and is working on it :smiley:

jphorn commented 3 years ago

Appreciate the honesty, feedback and roadmap.

mossifer commented 3 years ago

Thank you for working on it and for the transparency.

ArrayIterator commented 3 years ago

Thanks for the hard work 🙏

However, we are still facing the issue on the latest update.

I still find that the indexable and stem queries are causing the issue.

When the articles / prominent words grow to millions of rows, the DISTINCT / COUNT(stem) process slows down.

But when I split the array in the WHERE IN clause into chunks of at most 100 words and loop over them in a custom PHP script, CPU usage drops significantly.

As a temporary workaround, I created a custom cron job that reduces the prominent words by time range.

ArrayIterator commented 3 years ago

Please explain the php script. :)

Best Regards, Todd Batrynchuk

NOTE: WPSEO PREMIUM

The problem is caused in wp-content/plugins/{wp_seopremium_plugin_dir}/src/repositories/prominent-words-repository.php

Result of print_r($prominent_stems): about 2K prominent words are included in the WHERE IN statement

image

Query shown in the MySQL process list

image

The CPU-hungry code

image

After reducing the prominent word list (chunking the array in a loop) to 300 words max

image

Before splitting the prominent words by IDs (old image from 2 weeks ago; this is what made our team add additional CPU threads)

image

I split the array into smaller chunks, loop over them in PHP and merge the results, even though this also reduces performance.

You can imagine what happens when a WHERE IN statement contains an array of thousands of words and runs against millions of rows.

NOTE: I only tested with the SEO Premium plugin

Try to access the wp-admin/admin.php?page=wpseo_workouts page to test the prominent words query. The nightmare appears when the index contains millions of rows and has already been indexed: the page seems to load endlessly.

But when the index is truncated, empty, or contains only thousands of rows, it is blazing fast!!

I think this problem is caused by the multiple queries that execute the stem count and select on the Yoast prominent words.

Maybe the Yoast team has not tested with hundreds of thousands of articles containing millions of prominent words, and has not checked the resources used by the code.

ArrayIterator commented 3 years ago

And now, to keep using Yoast SEO Premium, our team periodically empties/truncates the Yoast prominent words and indexable data.

paulocoghi commented 3 years ago

@Djennez , the suggestion from @ArrayIterator seems interesting.

Djennez commented 3 years ago

It does, so I passed it along to the development team this morning and they were able to run a few tests based on the concept. Unfortunately, we were unable to get any performance improvement by splitting the query in chunks. In fact, performance worsened to different degrees, based on the chunk size.

@ArrayIterator I'm not sure if I read your comment right, but do you only see performance improvement after truncating prominent words / indexables? Or does your chunk method also work on the large datasets? Can you share additional details about your code changes?

ArrayIterator commented 3 years ago

  1. Performance improved when the prominent words / indexables were truncated or empty.
  2. Chunking the words only prevents MySQL from eating so many resources, but it increases response time because of the looping and the multiple queries. This does not improve performance; it only reduces CPU usage.

2.a. Truncating the prominent words makes the site faster than before, because there are no huge datasets anymore.

2.b. Splitting the words into chunks of no more than 300 in the WHERE IN statement only reduces CPU usage, while performance is degraded (response time increases). AND YES, PERFORMANCE WILL SLOW DOWN, BUT CPU USAGE IS REDUCED.

The biggest problem is this: adding thousands of words (with no limit) to the WHERE IN statement of the prominent words query makes MySQL CPU hungry, and then the server becomes unstable and the site goes down.

(For the array splitting, please see the image.)

In the method find_by_list_of_ids(), I used:

foreach (array_chunk($prominent_stems, 300) as $stem_list) {
    // Run the document-frequency query for $stem_list only,
    // then merge its result into $stem_counts.
}

When I open the workouts page and check the MySQL process list, I see Yoast checking many words against the dataset (the prominent words table), and many queries being executed because of multiple calls to the stem count.

So with no. 2 (chunk split), even though it reduces my server's CPU usage, the page seems to load endlessly, because response time increases with the additional queries executed inside the loop. (Even without chunking the word list, the page still takes a very long time to finish.)

BUT when the indexables / prominent words have been truncated or are empty, performance increases significantly.

I think the main issue is:

The queries and the logic for getting data out of the huge number of prominent word records.

Reducing the number of queries and changing how the prominent words data is served in the Yoast plugin may help.
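
The chunk-and-merge workaround described in this comment can be sketched roughly as follows. This is not the plugin's code (which is PHP); it is a TypeScript illustration only, and `CountQuery`, `chunk` and `countInChunks` are hypothetical names. The database call is replaced by an injected function, so the sketch only shows the trade-off being discussed: each query gets a shorter IN list, but more queries are issued.

```typescript
// Assumed stand-in for one "count documents per stem" database query.
type CountQuery = (stems: string[]) => Record<string, number>;

// Splits `items` into consecutive slices of at most `size` elements.
function chunk<T>(items: T[], size: number): T[][] {
  if (size < 1) {
    throw new RangeError("size must be at least 1");
  }
  const parts: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    parts.push(items.slice(i, i + size));
  }
  return parts;
}

// Runs one query per chunk and merges the per-stem counts.
// Also reports how many queries were needed.
function countInChunks(
  stems: string[],
  size: number,
  query: CountQuery
): { counts: Record<string, number>; queries: number } {
  const counts: Record<string, number> = {};
  let queries = 0;
  for (const part of chunk(stems, size)) {
    queries += 1;
    const partial = query(part);
    for (const stem of Object.keys(partial)) {
      counts[stem] = (counts[stem] ?? 0) + partial[stem];
    }
  }
  return { counts, queries };
}
```

With a list of 2,000 stems and a chunk size of 300, this issues 7 queries instead of 1, which is consistent with the reports in this thread that chunking lowers CPU pressure per query but lengthens the total response time.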

ArrayIterator commented 3 years ago

Reference

Result of print_r($prominent_stems): the word list in the WHERE IN statement that makes MySQL CPU hungry.

paulocoghi commented 3 years ago

@Djennez , minutes ago I updated Yoast SEO Pro from 16.5 to 17.0.

After it, our database CPU usage increased almost 10x, from 300%-400% to 3000%-4000%

paulocoghi commented 3 years ago

Update

After choosing the option to optimize the SEO data, the database CPU processing is now between 300% and 500% Everything seems normal again.

The reduction in processing was momentary. Now it has grown to 3000%-4000% again.

coolrecep commented 3 years ago

17.1 released today. Does it fix the issue?

paulocoghi commented 3 years ago

@coolrecep I will try it at a less busy time, to avoid impacting the editors/journalists in case the issue isn't fixed, and I will post the results here.

nimmolo commented 3 years ago

Yo - what is going on with this? Thanks to @ArrayIterator and @paulocoghi for doing so much detective work.

I feel like the problem has been adequately documented for quite a long time.

Djennez commented 3 years ago

I believe some improvements to indexables were made over the past few versions. Though I have not been able to stay up to date for the last 2 weeks (and also will not be for the next week). Any additional information is welcome when testing with the latest version and we'll be able to take that into consideration.

PS: Please test the updates for yourself if you want to be sure, as the people experiencing these issues may not actually be running into the same underlying problem. So one fix may work for one person, and not for the other.

jphorn commented 2 years ago

Did anyone already test Yoast SEO 17.2 on a production site?

paulocoghi commented 2 years ago

Sorry for the late reply. Our customer (a Yoast SEO Pro subscriber) is concerned about updating and possibly getting downtime (again).

It's a news portal with a monthly average of 20-30 million sessions and 300-400 million impressions, and we would have to clone the whole server in order to test Yoast.

After the downtime caused, our customer decided to completely disable the link suggestion feature. :(

ArrayIterator commented 2 years ago

Any news? After the current update, all 32 threads go to 99% 😓

When I deactivate WPSEO Premium (the Yoast plugin), CPU usage goes back to normal.

More than a year and still not resolved? 😓

Maybe the team can add a feature to totally disable word suggestions / indexables, to prevent the server CPU from crashing.

paulocoghi commented 2 years ago

IMHO, the suggestions previously made by @stayallive are the way to go, on every Yoast feature which provides data on user typing:

paulocoghi commented 2 years ago

Currently, when the author types, Yoast initiates dozens of parallel queries, when only the last one will in fact be read and used.

This creates a great amount of unnecessary processing and execution time, which is clearly visible on sites with a large number of posts.
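
One common way to avoid the pattern described above is to debounce the typing handler and abort any request that has been superseded. The following is a minimal sketch of that idea, not Yoast's implementation: `debounce`, `LatestRequest` and `createSuggestionRequester` are hypothetical names, and `send` is an assumed callback that would perform the real HTTP call (for example by passing the signal to `fetch`).

```typescript
// Hypothetical helper names throughout; nothing here is taken from Yoast SEO.

// Delays calls to `fn` until `waitMs` has passed without a new call.
function debounce<A extends unknown[]>(
  fn: (...args: A) => void,
  waitMs: number
): (...args: A) => void {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return (...args: A) => {
    if (timer !== undefined) {
      clearTimeout(timer);
    }
    timer = setTimeout(() => fn(...args), waitMs);
  };
}

// Hands out one AbortSignal at a time and aborts the previous one,
// so only the most recent request is still allowed to complete.
class LatestRequest {
  private controller: AbortController | null = null;

  next(): AbortSignal {
    if (this.controller !== null) {
      this.controller.abort();
    }
    this.controller = new AbortController();
    return this.controller.signal;
  }
}

// Combines both: at most one call per typing pause, and each new call
// aborts the signal given to the previous one.
function createSuggestionRequester(
  send: (text: string, signal: AbortSignal) => void,
  waitMs: number
): (text: string) => void {
  const latest = new LatestRequest();
  return debounce((text: string) => send(text, latest.next()), waitMs);
}
```

Note that aborting on the client only drops the browser's side of the request. Whether the database query that is already running on the server stops as well depends on the server setup, so this reduces the number of parallel requests but is not a substitute for making the query itself cheaper.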

jphorn commented 2 years ago

Has this ever been fixed? We're still on Yoast SEO 16.3 :-(

JoeyBol commented 2 years ago

We are running version 18.0 and we are still dealing with high CPU loads so I don't think it has been fixed.

paulocoghi commented 2 years ago

It's been 18 months now. My definitive suggestion to the Yoast Pro team:

paulocoghi commented 2 years ago

As I said earlier, as a paying customer of Yoast Pro, it crashed a server with 48 (Epyc) CPUs and 128GB of RAM, of which 64GB are dedicated to a highly-tuned MariaDB installation.

When the mentioned Yoast Pro features are disabled, the same server can handle tens of millions of visits per day with little effort and zero CPU spikes.

When the same Yoast features are enabled, they easily crash our database, and we don't want to create a MariaDB cluster only to accommodate the abnormal load those features generate.

With all due respect, we can no longer afford to use our production servers and customers as a playground/laboratory to identify and diagnose such issues, most of which relate to the paid (Pro) version of Yoast.

Unfortunately our customers are angry but, at the same time, they would like to continue using Yoast Pro, as they really liked its features (which are currently disabled).

I believe my previous suggestions are a feasible way to simulate and diagnose the same issues we are all having and extensively posting about here.

Thanks a lot for your understanding and effort.

jphorn commented 2 years ago

@Djennez When can we expect a proper follow-up on the tests outlined by @paulocoghi, and on the suggestions made by @stayallive? Why is no one assigned to this issue, and why has no milestone been defined?

mmikhan commented 2 years ago

Internally opened: https://yoast.atlassian.net/browse/IM-1713

Djennez commented 2 years ago

This issue has been put on our internal backlog for the devs to reinvestigate.

jphorn commented 2 years ago

Any news? It's been another month. Any updates on the internal backlog issue? Has a milestone been defined?

mmikhan commented 2 years ago

A PR has been merged today in Yoast SEO Premium to minimise the HTTP requests made to generate the internal linking suggestions. We expect the issue to be fixed by that PR and are therefore closing the issue. If the issue persists after using the relevant release, do feel free to let us know.