Yoast / wordpress-seo

Yoast SEO for WordPress
https://yoast.com/wordpress/plugins/seo/

Page or post sitemap rendering is taking too much time #12304

Open stodorovic opened 5 years ago

stodorovic commented 5 years ago

Please give us a description of what happened.

This is similar to #12302. If the filter wpseo_sitemap_content_before_parse_html_images is used to execute all shortcodes on all pages/posts, max_execution_time can be reached.

Please describe what you expected to happen and why.

Faster loading of page/post sitemaps, without a "500 Internal Server Error". At the very least, the plugin could show a notice about it or add an option to disable the image parser.

How can we reproduce this behavior?

  1. Create a lot of posts (or pages), each with a few complex shortcodes.
  2. Install the Divi theme or add the following code:
    add_filter( 'wpseo_sitemap_content_before_parse_html_images', function( $content ) {
        return apply_filters( 'the_content', $content );
    } );
  3. Check the loading time of the post (or page) sitemap.

Technical info

There are similar issues on the support forum. The latest one is "Posts-Sitemap.xml Locked / Cannot be opened Google GWT Error".

Used versions

stodorovic commented 5 years ago

Example (after the user increased max_execution_time). There are about 500 posts and 13 pages:

$ curl -i https://xxxxxxxxxx.com/index.php?sitemap=post
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    63    0    63    0     0      1      0 --:--:--  0:00:34 --:--:--    12
HTTP/1.1 200 OK
Date: Tue, 26 Feb 2019 10:17:51 GMT
Content-Type: text/xml; charset=UTF-8
Transfer-Encoding: chunked
...
$ curl -i https://xxxxxxxxxx.com/index.php?sitemap=page
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 36873    0 36873    0     0   8601      0 --:--:--  0:00:04 --:--:--  8635
HTTP/1.1 200 OK
Date: Tue, 26 Feb 2019 10:23:41 GMT
Content-Type: text/xml; charset=UTF-8
Transfer-Encoding: chunked
...

There is a big difference between loading the page sitemap (4 s) and the post sitemap (34 s).

stodorovic commented 5 years ago

After small changes (from the Elegant Themes developers) in .../includes/builder/plugin-compat/wordpress-seo.php, it's a little better:

$ curl -i https://xxxxxxxxxx.com/index.php?sitemap=post
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100     0    0     0    0     0      1      0 --:--:--  0:00:28 --:--:--    12
HTTP/1.1 200 OK
Date: Wed, 06 Mar 2019 14:16:31 GMT
Content-Type: text/xml; charset=UTF-8
Transfer-Encoding: chunked
...

After the user added the following code (which prevents the 'the_content' filter from being triggered):

add_filter( 'wpseo_sitemap_content_before_parse_html_images', '__return_empty_string', 9 );

There is a huge difference (4 seconds instead of 28 seconds):

$ curl -i https://xxxxxxxxxx.com/index.php?sitemap=post
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 64547    0 64547    0     0  15553      0 --:--:--  0:00:04 --:--:-- 15613
HTTP/1.1 200 OK
Date: Wed, 06 Mar 2019 14:16:31 GMT
Content-Type: text/xml; charset=UTF-8
Transfer-Encoding: chunked
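
To illustrate why registering '__return_empty_string' at priority 9 helps, here is a minimal sketch of priority-ordered filters. It is a toy model, not WordPress code: toy_apply_filters() is a hypothetical helper, and the assumption is that the expensive callback sits at the default priority 10, so it runs after the priority 9 callback and only ever receives an empty string.

```php
<?php
// Toy model of priority-ordered filters. This is NOT the WordPress implementation.
function toy_apply_filters( array $filters, $value ) {
    ksort( $filters ); // lower priority numbers run first
    foreach ( $filters as $callbacks ) {
        foreach ( $callbacks as $callback ) {
            $value = $callback( $value );
        }
    }
    return $value;
}

$received = null;

$filters = array(
    // Stand-in for an expensive callback (assumed to be at default priority 10).
    10 => array(
        function ( $content ) use ( &$received ) {
            $received = $content; // record what the expensive callback gets to see
            return $content;
        },
    ),
    // Stand-in for '__return_empty_string' registered at priority 9.
    9  => array(
        function () {
            return '';
        },
    ),
);

$result = toy_apply_filters( $filters, '[et_pb_image src="photo.jpg"]' );

var_dump( $received ); // what the priority 10 callback received
var_dump( $result );   // what the image parser would receive
```

The trade-off is the one visible in the thread: the image parser also receives an empty string, so no images from the post content end up in the sitemap.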

mmikhan commented 3 years ago

Please inform the customer of conversation # 734729 when this conversation has been closed.

jasperw1996 commented 8 months ago

I know this is very old – but the problem still exists and I was looking for a compromise that speeds up sitemap generation without losing all images. This seems to do the trick:

/* Don't run Divi shortcodes for image extraction on sitemaps because this takes ages! */
function divi_child_disable_shortcode_resolution_on_sitemaps() {
    global $wp_filter;

    // Nothing to do unless callbacks are registered on pre_get_posts at priority 0.
    if ( ! isset( $wp_filter['pre_get_posts']->callbacks[0] ) ) {
        return;
    }

    foreach ( $wp_filter['pre_get_posts']->callbacks[0] as $key => $callback ) {
        // Match callbacks by the method name in their key; str_contains() needs PHP 8.0+ or a polyfill.
        if ( str_contains( $key, 'maybe_load_builder_modules_early' ) ) {
            unset( $wp_filter['pre_get_posts']->callbacks[0][ $key ] );
        }
    }
}
add_action( 'pre_get_posts', 'divi_child_disable_shortcode_resolution_on_sitemaps', 0 );

/* generate img tags manually that can be parsed by Yoast afterwards – this is less effective, but much faster than Divi's approach */
function divi_child_prepare_images_for_sitemaps( $content ) {
    // Turn every src="..." attribute that points at a local URL with a 3-4 letter extension into an <img> tag.
    $pattern = '/src="(' . preg_quote( get_home_url(), '/' ) . '\/[^"]+\.[a-z]{3,4})"/i';
    return preg_replace( $pattern, '<img src="$1">', $content );
}
add_filter( 'wpseo_sitemap_content_before_parse_html_images', 'divi_child_prepare_images_for_sitemaps' );

This only includes images that are referenced in "src" attributes (mainly images in et_pb_image shortcodes), so some images will still be missing. It's a very ugly and dirty workaround, but I think you're already used to that kind of fix anyway if you're using Divi ;)
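
For anyone who wants to check what the regular expression above does before putting it on a live site, here is a rough standalone sketch that runs outside WordPress. The home URL https://example.com, the sample shortcode string and the function name wrap_local_src_in_img are all made up for illustration; the hard-coded $home_url stands in for get_home_url().

```php
<?php
// Standalone sketch: $home_url replaces get_home_url(), which only exists inside WordPress.
$home_url = 'https://example.com';

// Hypothetical Divi-style content: one local image and one hosted elsewhere.
$content = '[et_pb_image src="https://example.com/wp-content/uploads/photo.jpg" alt="Photo"]'
    . '[et_pb_image src="https://cdn.example.org/other.png"]';

// Same pattern and replacement as in divi_child_prepare_images_for_sitemaps().
function wrap_local_src_in_img( $content, $home_url ) {
    $pattern = '/src="(' . preg_quote( $home_url, '/' ) . '\/[^"]+\.[a-z]{3,4})"/i';
    return preg_replace( $pattern, '<img src="$1">', $content );
}

$result = wrap_local_src_in_img( $content, $home_url );
echo $result, PHP_EOL;
```

Only the first src is rewritten into an img tag, because the second URL does not start with the home URL. That matches the limitation described above: images served from another host, or referenced through an attribute other than src, stay invisible to the image parser.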