humanmade / authorship

A modern approach to author attribution in WordPress.
GNU General Public License v3.0
66 stars 7 forks source link

Avoid the `wp_insert_post` action from compounding when multiple posts are inserted #134

Closed johnbillion closed 11 months ago

johnbillion commented 11 months ago

Authorship suffers from a compounding query problem when inserting multiple posts, for example during a bulk content import where wp_insert_post() is called many times.

With each successive call to wp_insert_post() Authorship performs an ever-larger number of queries to set the post author, and it compounds quickly. This is because an action is added to the wp_insert_post from within a filter on wp_insert_post_data. Each time wp_insert_post_data gets called (once per call to wp_insert_post()) the wp_insert_post action gets added again. If you insert 10,000 posts then by the end of it the wp_insert_post hook is performing 10,000 actions each time, once for each closure.

This PR addresses the problem by moving the wp_insert_post_data filter and wp_insert_post action into class methods so they're only added once and the $postarr state can be passed between them without the need for a closure. The whole reason this code is needed is because WordPress does not pass the original $postarr value to the wp_insert_post hook.

All existing tests pass and I've added a new one to cover this.

johnbillion commented 11 months ago

You're correct on both points but these two functions have just been copied into the class methods with their existing logic.

goldenapples commented 11 months ago

I think my preference would have been to have the action on wp_insert_post remove itself after running, and then let the filter on wp_insert_post_data add a new hook on wp_insert_post each time it runs. Would

$action_wp_insert_post = 
    function( int $post_ID, WP_Post $post, bool $update ) 
        use ( $unsanitized_postarr, &$action_wp_insert_post ) {
            remove_action( 'wp_insert_post', $action_wp_insert_post, 10, 3 );
            ...

have worked in this case?

This is just stylistic though - this way of instantiating a pseudo-singleton feels like it goes against our coding style, but it works just fine

johnbillion commented 11 months ago

I believe that won't work because the $action_wp_insert_post reference won't exist at the point where the closure is instantiated so you can't use it. Not 100% sure though.

rmccue commented 11 months ago

I believe that won't work because the $action_wp_insert_post reference won't exist at the point where the closure is instantiated so you can't use it. Not 100% sure though.

You can use it by reference and it'll work, but I'd say it's not super idiomatic; generally, I'd use a static mutex instead:

add_action( 'hook', function () {
    static $is_running = false;
    if ( $is_running ) {
        return;
    }

    $is_running = true;
    // do_the_thing();
    $is_running = false;
} );
Nikschavan commented 8 months ago

Is there plan to release this? This fix is not released yet in a new tag and packagist yet.