Automattic / msm-sitemap

Comprehensive sitemaps for your WordPress VIP site. Joint collaboration between Metro.co.uk, WordPress VIP, Alley Interactive, Maker Media, 10up, and others.
73 stars 37 forks source link

Deleted posts are not removed #112

Open mjangda opened 7 years ago

mjangda commented 7 years ago

Reproduction Steps:

The sitemap for today should no longer have the post but still does. This is because out update cron only looks through the post_modified column. Posts that were deleted are not included.

mjangda commented 7 years ago

This is a bit tricky. Instead of looking for posts based on the last modified date, it might make sense to hook into transition post status instead and track the dates for those posts (in an option) and then use those instead. Something like:

add_action( 'transition_post_status', function( $new_status, $old_status, $post ) {
    if ( 'publish' !== $new_status || 'publish' !== $old_status ) {
        return;
    }

    $dates_to_update = get_option( 'msm_sitemap_next_run_update_dates', array() );
    $dates_to_update[] = date( 'Y-m-d', strtotime( $post->post_date ) );
    $dates_to_update = array_unique( $dates_to_update );
    update_option( 'msm_sitemap_next_run_update_dates', $dates_to_update, false );
}, 10, 3 ); 

Not sure what the implications of this are. There may be race conditions if a post is updated while the cron is triggering (although, I think those can probably happen now too).

mattoperry commented 7 years ago

I was thinking something very similar, except usinguntrash_post and trashed_post ... however, I think your way is better because it would save scheduling jobs for stuff being trashed from other statuses (like draft.)

@mjangda can you point me toward msm_sitemap_next_run_update_dates? Where else is that option used?

mjangda commented 7 years ago

can you point me toward msm_sitemap_next_run_update_dates? Where else is that option used?

It doesn't exist yet :) My thinking was that we can add it and update update_sitemap_from_modified_posts to use it instead of the get_last_modified_posts lookup.

mattoperry commented 7 years ago

Makes sense :-)