WebDevStudios / wp-search-with-algolia

Improve search on your site. Autocomplete is included, along with full control over look, feel and relevance.
https://wordpress.org/plugins/wp-search-with-algolia/

Partial indexing of posts #417

Closed mrfsrf closed 1 month ago

mrfsrf commented 2 months ago

Describe the bug

I have around 270 posts, but it looks like the plugin has indexed only 5 (of the post post type). See the screenshots.

Screenshot 2024-05-21 at 15 48 37 Screenshot 2024-05-21 at 15 51 23

I use the "Use Algolia with the native WordPress search template" option. I did a re-index and pushed settings.

The first thing that came to mind was the cap on the basic plan, but it looks like I'm using only 2.28KB.

Screenshot 2024-05-21 at 15 55 52


tw2113 commented 2 months ago

Have you done any customizations and whatnot around indexing logic?

Also are you making sure you're looking at the "wp_searchable_posts" index, with whichever prefix you've chosen?

The basic plan shouldn't be a factor here yet, and the plugin should be doing what it can to keep the records under that 10kb limit via splitting.

mrfsrf commented 2 months ago

@tw2113 No customization whatsoever. Yes, see the screenshot. The prefix is the default one, wp_

Screenshot 2024-05-21 at 22 46 03
mrfsrf commented 2 months ago

I don't know if this will help, but I tried running the re-index with WP-CLI:

➜ wp-docker git:(master) ✗ ./wp algolia re-index --all --debug

[+] Building 0.0s (0/0)                                                                                                         docker:desktop-linux
 Container wp-docker-db-1  Running
 Container wp-docker-wordpress-1  Running
[+] Building 0.0s (0/0)                                                                                                         docker:desktop-linux
Debug (bootstrap): Processing bootstrap step: WP_CLI\Bootstrap\DeclareAbstractBaseCommand (0.009s)
Debug (bootstrap): Processing bootstrap step: WP_CLI\Bootstrap\IncludeFrameworkAutoloader (0.009s)
Debug (bootstrap): Processing bootstrap step: WP_CLI\Bootstrap\ConfigureRunner (0.009s)
Debug (bootstrap): Processing bootstrap step: WP_CLI\Bootstrap\IncludeRequestsAutoloader (0.009s)
Debug (bootstrap): Setting RequestsLibrary::$version to v2 (0.009s)
Debug (bootstrap): Setting RequestsLibrary::$source to wp-core (0.009s)
Debug (bootstrap): Setting RequestsLibrary::$class_name to \WpOrg\Requests\Requests (0.009s)
Debug (bootstrap): Processing bootstrap step: WP_CLI\Bootstrap\InitializeColorization (0.009s)
Debug (bootstrap): Processing bootstrap step: WP_CLI\Bootstrap\InitializeLogger (0.009s)
Debug (bootstrap): Processing bootstrap step: WP_CLI\Bootstrap\DefineProtectedCommands (0.009s)
Debug (bootstrap): Processing bootstrap step: WP_CLI\Bootstrap\LoadExecCommand (0.009s)
Debug (bootstrap): Processing bootstrap step: WP_CLI\Bootstrap\LoadRequiredCommand (0.009s)
Debug (bootstrap): Processing bootstrap step: WP_CLI\Bootstrap\IncludePackageAutoloader (0.01s)
Debug (bootstrap): Processing bootstrap step: WP_CLI\Bootstrap\IncludeFallbackAutoloader (0.01s)
Debug (bootstrap): Fallback autoloader paths: phar://wp-cli.phar/vendor/autoload.php (0.01s)
Debug (bootstrap): Loading detected autoloader: phar://wp-cli.phar/vendor/autoload.php (0.01s)
Debug (bootstrap): Attaching command 'config edit' to hook before_wp_load (0.013s)
Debug (bootstrap): Attaching command 'config path' to hook before_wp_load (0.013s)
Debug (bootstrap): Attaching command 'config list' to hook before_wp_load (0.013s)
Debug (bootstrap): Attaching command 'config get' to hook before_wp_load (0.013s)
Debug (bootstrap): Attaching command 'config is-true' to hook before_wp_load (0.013s)
Debug (bootstrap): Attaching command 'config set' to hook before_wp_load (0.014s)
Debug (bootstrap): Attaching command 'config delete' to hook before_wp_load (0.014s)
Debug (bootstrap): Attaching command 'config has' to hook before_wp_load (0.014s)
Debug (bootstrap): Attaching command 'config shuffle-salts' to hook before_wp_load (0.014s)
Debug (commands): Adding command: config (0.014s)
Debug (bootstrap): Attaching command 'core download' to hook before_wp_load (0.015s)
Debug (bootstrap): Attaching command 'core version' to hook before_wp_load (0.015s)
Debug (commands): Adding command: core (0.015s)
Debug (bootstrap): Attaching command 'eval' to hook before_wp_load (0.015s)
Debug (commands): Adding command: eval (0.015s)
Debug (bootstrap): Attaching command 'eval-file' to hook before_wp_load (0.015s)
Debug (commands): Adding command: eval-file (0.015s)
Debug (commands): Adding command: cache (0.015s)
Debug (commands): Adding command: transient (0.016s)
Debug (bootstrap): Attaching command 'core verify-checksums' to hook before_wp_load (0.016s)
Debug (commands): Adding command: verify-checksums in core Namespace (0.016s)
Debug (commands): Adding command: plugin (0.016s)
Debug (commands): Adding command: verify-checksums in plugin Namespace (0.016s)
Debug (commands): Adding command: cron (0.016s)
Debug (commands): Adding command: event in cron Namespace (0.017s)
Debug (commands): Adding command: schedule in cron Namespace (0.017s)
Debug (bootstrap): Attaching command 'db' to hook after_wp_config_load (0.018s)
Debug (bootstrap): Attaching command 'db clean' to hook after_wp_load (0.018s)
Debug (bootstrap): Attaching command 'db tables' to hook after_wp_load (0.018s)
Debug (bootstrap): Attaching command 'db size' to hook after_wp_load (0.018s)
Debug (bootstrap): Attaching command 'db prefix' to hook after_wp_load (0.018s)
Debug (bootstrap): Attaching command 'db search' to hook after_wp_load (0.018s)
Debug (bootstrap): Attaching command 'db columns' to hook after_wp_load (0.018s)
Debug (commands): Adding command: db (0.018s)
Debug (commands): Adding command: embed (0.018s)
Debug (commands): Adding command: fetch in embed Namespace (0.019s)
Debug (commands): Adding command: provider in embed Namespace (0.019s)
Debug (commands): Adding command: handler in embed Namespace (0.019s)
Debug (commands): Adding command: cache in embed Namespace (0.019s)
Debug (commands): Adding command: comment (0.02s)
Debug (commands): Adding command: meta in comment Namespace (0.02s)
Debug (commands): Adding command: menu (0.02s)
Debug (commands): Adding command: item in menu Namespace (0.02s)
Debug (commands): Adding command: location in menu Namespace (0.021s)
Debug (commands): Deferring command: network meta (0.021s)
Debug (commands): Adding command: option (0.021s)
Debug (commands): Adding command: post (0.022s)
Debug (commands): Adding command: meta in post Namespace (0.022s)
Debug (commands): Adding command: term in post Namespace (0.022s)
Debug (commands): Adding command: post-type (0.022s)
Debug (commands): Adding command: site (0.023s)
Debug (commands): Adding command: meta in site Namespace (0.023s)
Debug (commands): Adding command: option in site Namespace (0.023s)
Debug (commands): Adding command: taxonomy (0.024s)
Debug (commands): Adding command: term (0.024s)
Debug (commands): Adding command: meta in term Namespace (0.024s)
Debug (commands): Adding command: user (0.028s)
Debug (commands): Adding command: application-password in user Namespace (0.029s)
Debug (commands): Adding command: meta in user Namespace (0.029s)
Debug (commands): Adding command: session in user Namespace (0.029s)
Debug (commands): Adding command: term in user Namespace (0.029s)
Debug (commands): Adding command: network (0.029s)
Debug (hooks): Processing hook "after_add_command:network" with 1 callbacks (0.029s)
Debug (hooks): On hook "after_add_command:network": Closure in file phar:///usr/local/bin/wp/vendor/wp-cli/wp-cli/php/class-wp-cli.php at line 695 (0.029s)
Debug (commands): Adding command: meta in network Namespace (0.03s)
Debug (commands): Adding command: export (0.03s)
Debug (commands): Adding command: plugin (0.032s)
Debug (commands): Adding command: auto-updates in plugin Namespace (0.032s)
Debug (commands): Adding command: theme (0.033s)
Debug (commands): Adding command: auto-updates in theme Namespace (0.033s)
Debug (commands): Adding command: mod in theme Namespace (0.034s)
Debug (bootstrap): Attaching command 'i18n' to hook before_wp_load (0.034s)
Debug (commands): Adding command: i18n (0.034s)
Debug (bootstrap): Attaching command 'i18n make-pot' to hook before_wp_load (0.035s)
Debug (commands): Adding command: make-pot in i18n Namespace (0.035s)
Debug (bootstrap): Attaching command 'i18n make-json' to hook before_wp_load (0.035s)
Debug (commands): Adding command: make-json in i18n Namespace (0.035s)
Debug (bootstrap): Attaching command 'i18n make-mo' to hook before_wp_load (0.035s)
Debug (commands): Adding command: make-mo in i18n Namespace (0.035s)
Debug (bootstrap): Attaching command 'i18n make-php' to hook before_wp_load (0.035s)
Debug (commands): Adding command: make-php in i18n Namespace (0.035s)
Debug (bootstrap): Attaching command 'i18n update-po' to hook before_wp_load (0.036s)
Debug (commands): Adding command: update-po in i18n Namespace (0.036s)
Debug (commands): Adding command: import (0.036s)
Debug (commands): Deferring command: language core (0.037s)
Debug (commands): Deferring command: language plugin (0.037s)
Debug (commands): Deferring command: language theme (0.037s)
Debug (hooks): Immediately invoking on passed hook "after_add_command:site": Closure in file phar:///usr/local/bin/wp/vendor/wp-cli/language-command/language-command.php at line 39 (0.037s)
Debug (commands): Adding command: switch-language in site Namespace (0.038s)
Debug (commands): Adding command: language (0.038s)
Debug (hooks): Processing hook "after_add_command:language" with 3 callbacks (0.038s)
Debug (hooks): On hook "after_add_command:language": Closure in file phar:///usr/local/bin/wp/vendor/wp-cli/wp-cli/php/class-wp-cli.php at line 695 (0.038s)
Debug (commands): Adding command: core in language Namespace (0.038s)
Debug (hooks): On hook "after_add_command:language": Closure in file phar:///usr/local/bin/wp/vendor/wp-cli/wp-cli/php/class-wp-cli.php at line 695 (0.038s)
Debug (commands): Adding command: plugin in language Namespace (0.038s)
Debug (hooks): On hook "after_add_command:language": Closure in file phar:///usr/local/bin/wp/vendor/wp-cli/wp-cli/php/class-wp-cli.php at line 695 (0.038s)
Debug (commands): Adding command: theme in language Namespace (0.038s)
Debug (bootstrap): Attaching command 'maintenance-mode' to hook after_wp_load (0.038s)
Debug (commands): Adding command: maintenance-mode (0.038s)
Debug (commands): Adding command: media (0.039s)
Debug (bootstrap): Attaching command 'package' to hook before_wp_load (0.041s)
Debug (commands): Adding command: package (0.041s)
Debug (commands): Adding command: rewrite (0.041s)
Debug (commands): Adding command: rewrite (0.041s)
Debug (commands): Adding command: cap (0.042s)
Debug (commands): Adding command: role (0.042s)
Debug (commands): Adding command: scaffold (0.043s)
Debug (commands): Adding command: search-replace (0.044s)
Debug (bootstrap): Attaching command 'server' to hook before_wp_load (0.044s)
Debug (commands): Adding command: server (0.044s)
Debug (commands): Adding command: shell (0.044s)
Debug (commands): Adding command: super-admin (0.045s)
Debug (commands): Adding command: widget (0.045s)
Debug (commands): Adding command: sidebar (0.045s)
Debug (bootstrap): Processing bootstrap step: WP_CLI\Bootstrap\RegisterFrameworkCommands (0.045s)
Debug (bootstrap): Adding framework command: phar://wp-cli.phar/vendor/wp-cli/wp-cli/php/commands/cli.php (0.045s)
Debug (bootstrap): Attaching command 'cli' to hook before_wp_load (0.046s)
Debug (bootstrap): Attaching command 'cli has-command' to hook after_wp_load (0.047s)
Debug (commands): Adding command: cli (0.047s)
Debug (bootstrap): Attaching command 'cli cache' to hook before_wp_load (0.047s)
Debug (commands): Adding command: cache in cli Namespace (0.047s)
Debug (bootstrap): Attaching command 'cli alias' to hook before_wp_load (0.047s)
Debug (commands): Adding command: alias in cli Namespace (0.047s)
Debug (bootstrap): Adding framework command: phar://wp-cli.phar/vendor/wp-cli/wp-cli/php/commands/help.php (0.047s)
Debug (commands): Adding command: help (0.047s)
Debug (bootstrap): Processing bootstrap step: WP_CLI\Bootstrap\RegisterDeferredCommands (0.047s)
Debug (bootstrap): Processing bootstrap step: WP_CLI\Bootstrap\InitializeContexts (0.047s)
Debug (bootstrap): Processing bootstrap step: WP_CLI\Bootstrap\LaunchRunner (0.048s)
Debug (bootstrap): No readable global config found (0.048s)
Debug (bootstrap): No project config found (0.048s)
Debug (bootstrap): argv: /usr/local/bin/wp algolia re-index --all --debug (0.048s)
Debug (bootstrap): ABSPATH defined: /var/www/html/ (0.048s)
Debug (hooks): Executing hook: before_wp_load (0.048s)
Debug (context): Using context 'cli' (0.049s)
Debug (bootstrap): Begin WordPress load (0.049s)
Debug (bootstrap): wp-config.php path: /var/www/html/wp-config.php (0.049s)
Debug (bootstrap): Looking for UTF-8 BOM (0.05s)
Debug (bootstrap): Looking for UTF-16 (BE) BOM (0.05s)
Debug (bootstrap): Looking for UTF-16 (LE) BOM (0.05s)
Debug (hooks): Executing hook: after_wp_config_load (0.051s)
Debug (bootstrap): Attaching command 'yoast cleanup' to hook after_wp_load (1.033s)
Debug (commands): Adding command: yoast (1.033s)
Debug (bootstrap): Attaching command 'yoast index' to hook after_wp_load (1.075s)
Debug (commands): Adding command: yoast (1.075s)
Debug (commands): Adding command: algolia (1.092s)
Debug (commands): Adding command: timber (1.58s)
Debug (bootstrap): Loaded WordPress (1.662s)
Debug (hooks): Processing hook "before_run_command" with 1 callbacks (1.662s)
Debug (hooks): On hook "before_run_command": WP_CLI\Bootstrap\RegisterDeferredCommands->add_deferred_commands() (1.662s)
Debug (bootstrap): Running command: algolia re-index (1.662s)
Success: Index wp_searchable_posts was created but no entries were sent.
tw2113 commented 2 months ago

Hmm.

Success: Index wp_searchable_posts was created but no entries were sent.

This indicates that it isn't finding anything to act on: it found 0 pages' worth of posts. I'm trusting that you haven't tinkered with the batch size either, which is filterable.

The re-index items count query is coming from:

$query = new WP_Query(
    array(
        'post_type'              => $this->post_types,
        'post_status'            => 'any', // Let the `should_index` take care of the filtering.
        'suppress_filters'       => true,
        'cache_results'          => false,
        'lazy_load_term_meta'    => false,
        'update_post_term_cache' => false,
    )
);

with $this->post_types being all post types in the install that are marked as searchable.

Any difference with just running ./wp algolia re-index wp_searchable_posts --debug ?

Also just in case, are you also making use of WP Search with Algolia Pro? Asking because I know we have integration with "noindex" settings via Yoast, and if you have some marked as such, that would affect your index volume.

mrfsrf commented 2 months ago

Any difference with just running ./wp algolia re-index wp_searchable_posts --debug ?

I had to run it without the prefix because of this error: Error: Index with id "wp_searchable_posts" does not exist. Make sure you don't include the prefix. It looks like I get the same result.

so: ./wp algolia re-index searchable_posts --debug

[+] Building 0.0s (0/0)                                                                                                                                                                                      docker:desktop-linux
 Container wp-docker-db-1  Running
 Container wp-docker-wordpress-1  Running
[+] Building 0.0s (0/0)                                                                                                                                                                                      docker:desktop-linux
Debug (bootstrap): Processing bootstrap step: WP_CLI\Bootstrap\DeclareAbstractBaseCommand (0.006s)
Debug (bootstrap): Processing bootstrap step: WP_CLI\Bootstrap\IncludeFrameworkAutoloader (0.007s)
Debug (bootstrap): Processing bootstrap step: WP_CLI\Bootstrap\ConfigureRunner (0.007s)
Debug (bootstrap): Processing bootstrap step: WP_CLI\Bootstrap\IncludeRequestsAutoloader (0.007s)
...
Debug (bootstrap): Loaded WordPress (0.72s)
Debug (hooks): Processing hook "before_run_command" with 1 callbacks (0.72s)
Debug (hooks): On hook "before_run_command": WP_CLI\Bootstrap\RegisterDeferredCommands->add_deferred_commands() (0.72s)
Debug (bootstrap): Running command: algolia re-index (0.72s)
Success: Index wp_searchable_posts was created but no entries were sent.

Also just in case, are you also making use of WP Search with Algolia Pro?

No. Basic free tier setup

Do you have knowledge about any conflicts with other plugins or certain version of WP?

[Notes]:

One interesting thing: when I run algolia from wp-cli to re-index posts and afterwards click Push Settings in the admin area, I get a different number of records in the Algolia dashboard.

mrfsrf commented 2 months ago

also tried the following: inside class-algolia-searchable-posts-index.php, added the error log:

    /**
     * Get re-index items count.
     *
     * @author WebDevStudios <contact@webdevstudios.com>
     * @since  1.0.0
     *
     * @return int
     */
    protected function get_re_index_items_count() {
        $query = new WP_Query(
            array(
                'post_type'              => $this->post_types,
                'post_status'            => 'any', // Let the `should_index` take care of the filtering.
                'suppress_filters'       => true,
                'cache_results'          => false,
                'lazy_load_term_meta'    => false,
                'update_post_term_cache' => false,
            )
        );

        error_log('Re-index items count: ' . print_r($query->found_posts, true));

        return (int) $query->found_posts;
    }

/wp-content/debug.log

[22-May-2024 07:54:38 UTC] Re-index items count: 0
[22-May-2024 07:54:38 UTC] Re-index items count: 0

EDIT:

I think the problem lies in $this->post_type. It returns this (from debug.log):

protected function get_re_index_items_count() {
  error_log('Post type: ' . print_r($this->post_type, true));
   ....
 }

 [22-May-2024 07:58:43 UTC] PHP Warning:  Undefined property: Algolia_Searchable_Posts_Index::$post_type in /var/www/html/wp-content/plugins/wp-search-with-algolia/includes/indices/class-algolia-searchable-posts-index.php on line 421
[22-May-2024 07:58:43 UTC] Post type: 
mrfsrf commented 2 months ago

Found a solution:

  1. Update WP to 6.5
  2. Modify function get_re_index_items_count

    /**
     * Get re-index items count.
     *
     * @author WebDevStudios <contact@webdevstudios.com>
     * @since  1.0.0
     *
     * @return int
     */
    protected function get_re_index_items_count() {
        // Fall back to 'post' when $this->post_types is unset; the cast
        // guarantees an array even if a single type was stored as a string.
        $post_types = (array) ( $this->post_types ?? array( 'post' ) );

        $query_args = array(
            'post_type'              => $post_types,
            'post_status'            => 'any', // Let the `should_index` take care of the filtering.
            'suppress_filters'       => true,
            'cache_results'          => false,
            'lazy_load_term_meta'    => false,
            'update_post_term_cache' => false,
            'posts_per_page'         => -1, // To get all posts.
        );

        // Debugging the query args.
        error_log( 'WP_Query args: ' . print_r( $query_args, true ) );

        $query = new WP_Query( $query_args );

        // found_posts is unreliable here (it stays 0 when no_found_rows is
        // set), so count the posts that were actually returned.
        $total_posts = count( $query->posts );

        // Output to debug log.
        error_log( 'Re-index items count: ' . $total_posts );

        return (int) $total_posts;
    }

    [Note:]

    For some reason found_posts on the WP_Query comes back as 0. That is probably because no_found_rows is set to true somewhere globally, so WordPress skips the found-rows count. [Update:] Yes, I had set that in my theme:

    add_action('pre_get_posts', 'set_timber_query_defaults');
    function set_timber_query_defaults($query) {
        $query->set('ignore_sticky_posts', true);
        $query->set('suppress_filters', true);
        $query->set('no_found_rows', true);

        return $query;
    }

Also had to include posts_per_page => -1 to get all posts

  3. Results

Screenshot 2024-05-22 at 11 08 48

tw2113 commented 2 months ago

had to use without prefix because Error: Index with id "wp_searchable_posts" does not exist. Make sure you don't include the prefix.

I always tend to forget that part, the detail about not including prefix.

Do you have knowledge about any conflicts with other plugins or certain version of WP?

Nothing known outright, but I could believe there's potential for conflicts if other plugins or themes run filters that affect results for WP_Query calls.

Interesting thing when i run algolia from wp-cli to re-index posts and afterwards click push Settings from Admin area, i get different # of records in Algolia dashboard

Shouldn't need to push settings regularly unless you've done some filtering/changes to the settings via code. For example, if you set a new attribute to be filterable. You'd want to use the push settings button at that time. However if you are simply including a new attribute to be indexed in general, you wouldn't need to push settings, but would need to do a re-index so that all the indexed items get that new attribute included.
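
As a rough illustration of that distinction, here is a minimal sketch. It assumes the plugin exposes the algolia_searchable_posts_index_settings and algolia_searchable_post_shared_attributes filters (names as recalled from the plugin source, so please verify them against your installed version). The reading_time meta key and the mytheme_ function names are hypothetical and used only for illustration.

```php
<?php
// Settings change: after adding this, "Push Settings" would be needed so
// that Algolia learns the attribute is filterable.
// Assumption: the filter name matches the installed plugin version.
function mytheme_algolia_index_settings( array $settings ) {
    $settings['attributesForFaceting'][] = 'reading_time'; // Hypothetical attribute.
    return $settings;
}
add_filter( 'algolia_searchable_posts_index_settings', 'mytheme_algolia_index_settings' );

// Record change: this only adds an attribute to each indexed record, so a
// re-index (not a settings push) is what makes it show up.
// Assumption: the filter passes the attributes array and the post object.
function mytheme_algolia_shared_attributes( array $attributes, $post ) {
    $attributes['reading_time'] = (int) get_post_meta( $post->ID, 'reading_time', true );
    return $attributes;
}
add_filter( 'algolia_searchable_post_shared_attributes', 'mytheme_algolia_shared_attributes', 10, 2 );
```

With only the second callback in place, a re-index should be enough; the first one is the kind of change that calls for the push settings button.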

[22-May-2024 07:58:43 UTC] PHP Warning: Undefined property: Algolia_Searchable_Posts_Index::$post_type in /var/www/html/wp-content/plugins/wp-search-with-algolia/includes/indices/class-algolia-searchable-posts-index.php on line 421

This one I find a bit odd because we don't have a property of "$post_type" used, but we do have "$post_types" plural used. So I would be curious what your edit with error_log('Post type: ' . print_r($this->post_types, true)); would have resulted in.

https://github.com/WebDevStudios/wp-search-with-algolia/blob/2.8.1/includes/indices/class-algolia-searchable-posts-index.php#L420-L433

add_action('pre_get_posts', 'set_timber_query_defaults');
function set_timber_query_defaults($query) {
  $query->set('ignore_sticky_posts', true);
  $query->set('suppress_filters', true);
  $query->set('no_found_rows', true);

  return $query;
}

You would probably gain a lot by returning early in this callback with an if statement that checks is_search(). I'm not familiar enough with Timber to know whether that defaults query is going to be a main query or a secondary one, so if ( $query->is_main_query() ) { return; } may also be beneficial.

That said, you shouldn't NEED to amend the posts per page, as the plugin is built to handle all of that calculation. So I'm curious what your results would be without the posts_per_page part and the early return with the add_action('pre_get_posts', 'set_timber_query_defaults'); callback, and then setting everything else back to how it was.
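
A minimal sketch of what such an early return could look like for the theme callback quoted above. The is_search() check comes from the suggestion in this comment; the is_admin() and WP_CLI guards are an extra assumption added here, on the reasoning that re-indexing runs from the admin area or from WP-CLI. Which guards are actually right depends on where Timber issues its queries, so treat this as a starting point rather than a confirmed fix.

```php
<?php
add_action( 'pre_get_posts', 'set_timber_query_defaults' );

function set_timber_query_defaults( $query ) {
    // Assumption: leave admin requests (including admin AJAX) and WP-CLI
    // runs alone, so the plugin's re-index count query keeps found_posts.
    if ( is_admin() || ( defined( 'WP_CLI' ) && WP_CLI ) ) {
        return;
    }

    // Leave search queries alone so native search totals stay intact.
    if ( $query->is_search() ) {
        return;
    }

    // Theme defaults for everything else, unchanged from the original.
    $query->set( 'ignore_sticky_posts', true );
    $query->set( 'suppress_filters', true );
    $query->set( 'no_found_rows', true );
}
```

pre_get_posts is an action, so the callback does not need to return the query; changes made through $query->set() apply to the object that was passed in.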

mrfsrf commented 2 months ago

Ok, thanks. I will try modifying that hook for the Timber query and get back with a response. One last question: is there a hook to get the total number of search results?

tw2113 commented 2 months ago

Total search results in what way? How many were found for a query? or how many should be getting indexed or were successfully indexed?

mrfsrf commented 2 months ago

How many were found for a query

tw2113 commented 2 months ago

With the "use with native search" option, we also hook into pre_get_posts ourselves: we set post__in to exactly the post IDs that Algolia returned for the query and page, so any native total-results count should reflect that too. If it doesn't, that may be something we could try to figure out how to update. I know InstantSearch returns its own values.
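
For the native route, a small sketch of reading that total in a theme. It assumes the native search integration leaves an accurate found_posts on the main query, as described above, and that nothing sets no_found_rows on search queries (which would leave the value at 0). mytheme_search_total is a hypothetical helper name.

```php
<?php
// Hypothetical helper: read the total number of results from a query object.
function mytheme_search_total( $query ) {
    return (int) $query->found_posts;
}

// Usage inside a search.php template (WordPress only):
// global $wp_query;
// echo esc_html( sprintf( '%d results found', mytheme_search_total( $wp_query ) ) );
```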

mrfsrf commented 1 month ago

Forgot to mark this as solved and close the issue