WebDevStudios / wp-search-with-algolia

Improve search on your site. Autocomplete is included, along with full control over look, feel and relevance.
https://wordpress.org/plugins/wp-search-with-algolia/

Large Data Indexing time #372

Open abdulkadernsu opened 10 months ago

abdulkadernsu commented 10 months ago

I have 250k+ products on my WooCommerce site, and I have implemented WP Search with Algolia for search. The problem is that indexing is taking too long: by my count, only about 100 items get indexed every 30 seconds. How can I speed up indexing? Also, if I update my products in the future, how will they sync with Algolia without a full reindex? Thanks

tw2113 commented 10 months ago

Are you going through the browser, or through something like the command line, which will probably process faster? Either way, you're still beholden to the server and the resources available to it at any given time. We also have some filters available for things like batch sizes that could be tweaked.

Simply put, a huge amount of content is going to take a while to get indexed, especially initially with an empty index.
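
As a rough check on the figures reported above (about 100 items per 30 seconds, 250k products), and assuming that rate stays constant for the whole run, a full initial index works out to roughly:

$$
\frac{250{,}000\ \text{items}}{100\ \text{items/batch}} \times 30\ \text{s/batch} = 75{,}000\ \text{s} \approx 20.8\ \text{hours}
$$

The real duration depends on server resources and on how long each batch actually takes, so treat this as an order-of-magnitude estimate only.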

Adding or updating individual products updates the indexes as you go. The main time you'd need a bulk re-index is when you change what data gets indexed. At that point, you'd need to re-index to retroactively push that data in for all your already indexed items.

abdulkadernsu commented 10 months ago

I'm using the browser, but I'd love to use the command line if it's faster. Is there documentation for indexing via the command line? And how can I increase the batch size?

samfrank commented 10 months ago

+1 for linking to the documentation on how to index via the command line - We are having the same issue

samfrank commented 10 months ago

@abdulkadernsu Hey! Just had a look at the wiki and I found some documentation on it https://github.com/WebDevStudios/wp-search-with-algolia/wiki/WP-CLI

samfrank commented 10 months ago

I don't know if it was faster, but I had an issue with WP Engine killing long processes after 60s, so I couldn't index everything. Using WP-CLI bypasses that issue.

tw2113 commented 10 months ago

That's the correct wiki URL for our WP-CLI documentation.

WP Engine's advanced panel definitely does visually "time out" after that amount of time, but I believe the actual commands keep running in the background. Ideally, with WP Engine you'd SSH in and use a proper terminal outside of the browser.

Filters: https://github.com/WebDevStudios/wp-search-with-algolia/wiki/Filter-Hooks#filters-reference

Specifically, algolia_indexing_batch_size, which defaults to the integer 100.
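
A minimal sketch of how that filter might be used, for anyone wanting to experiment. The filter name and its default of 100 come from this thread. The single-argument callback and the value 500 are assumptions for illustration only, so check the Filter Hooks wiki page for the exact signature before relying on it.

```php
<?php
// Sketch only: place this in a small custom plugin or the theme's functions.php.
// Assumption: the filter passes the current batch size as its first argument.
add_filter(
	'algolia_indexing_batch_size',
	function ( $batch_size ) {
		// 500 is an arbitrary example value, not a recommendation.
		return 500;
	}
);
```

Larger batches mean fewer requests overall, but each batch uses more memory and takes longer to complete, so it may be worth raising the value gradually and watching for timeouts.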

samfrank commented 10 months ago

Great to know, thanks @tw2113

tw2113 commented 9 months ago

Based on a lot of my recent work and findings, I want to clarify that WP-CLI isn't necessarily going to be faster. It is very useful for automation, though, since you can run commands in cron jobs, and something like wp algolia re-index would be an easy one to set up, among many other combinations.

tw2113 commented 4 months ago

@abdulkadernsu I wonder if some of the filters available in https://github.com/WebDevStudios/wp-search-with-algolia/wiki/Timeouts may help, especially on the cURL side of things.

I also have a new filter in the works that will help with configuring timeouts (https://www.algolia.com/doc/api-reference/api-methods/configuring-timeouts/), which may be a secondary thing to try out.