WordPress / wordpress-importer

The WordPress Importer
https://wordpress.org/plugins/wordpress-importer/
GNU General Public License v2.0

Improve the post_exists query time during post import process. #162

Open dhusakovic opened 5 months ago

dhusakovic commented 5 months ago

During the post import we check whether the post already exists using post_exists.

The query currently filters only on post title and date, so it has to run through all the posts to try to find a match.

In cases where we have a large number of posts, this query can time out and the import fails.

Passing post_type to the post_exists function lets the query use the type_status_date key, which significantly reduces the number of rows examined and the overall query time in certain instances.

This change would also fix some edge cases where posts of different post types have the same title and date.

The change is to pass the post_type argument to the check here: https://github.com/WordPress/wordpress-importer/blob/master/src/class-wp-import.php#L659

So it would look like this: $post_exists = post_exists( $post['post_title'], '', $post['post_date'], $post['post_type'] );
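
To illustrate the edge case mentioned above, here is a minimal, self-contained sketch. It does not call WordPress: the $posts array is a made-up stand-in for the posts table, and sketch_post_exists() is a hypothetical helper that only mimics matching on title and date with an optional post type. It is not core's post_exists().

```php
<?php
// Hypothetical in-memory stand-in for the posts table (illustration only).
$posts = array(
	array( 'ID' => 1, 'post_title' => 'Hello', 'post_date' => '2023-01-01 10:00:00', 'post_type' => 'page' ),
	array( 'ID' => 2, 'post_title' => 'Hello', 'post_date' => '2023-01-01 10:00:00', 'post_type' => 'post' ),
);

// Hypothetical helper, NOT core's post_exists(): returns the ID of the first
// row matching title and date, optionally restricted to one post type.
function sketch_post_exists( array $posts, $title, $date, $type = '' ) {
	foreach ( $posts as $row ) {
		if ( $row['post_title'] !== $title || $row['post_date'] !== $date ) {
			continue; // Title or date differs, so this row is not a match.
		}
		if ( '' !== $type && $row['post_type'] !== $type ) {
			continue; // A type was requested and this row has a different one.
		}
		return $row['ID'];
	}
	return 0; // No match found.
}

// The item being imported is a 'post' with the same title and date as the page.
$incoming = array( 'post_title' => 'Hello', 'post_date' => '2023-01-01 10:00:00', 'post_type' => 'post' );

$without_type = sketch_post_exists( $posts, $incoming['post_title'], $incoming['post_date'] );
$with_type    = sketch_post_exists( $posts, $incoming['post_title'], $incoming['post_date'], $incoming['post_type'] );

echo $without_type, "\n"; // 1: the page matched first, which is the wrong type.
echo $with_type, "\n";    // 2: the existing post of the same type.
```

Without the type filter, the lookup in this sketch returns the page, so a later type comparison would conclude that the incoming post does not exist yet. With the filter, the existing post is found directly.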

dd32 commented 5 months ago

This looks like a safe change. A few lines later, where $post_exists is used, the code checks that the post_type matches, so querying only for that post_type shouldn't have any adverse effects.

dhusakovic commented 5 months ago

Thanks for checking this out @dd32 🙇
What's the process for creating a PR to address the issue? It looks like I don't have the necessary access rights to push a branch up.

rebeccahum commented 4 months ago

@dhusakovic You'll probably need to fork the repo and open the PR from a branch on your fork.