algolia / docsearch-scraper

DocSearch - Scraper
https://docsearch.algolia.com/

Algolia search breaks after running subsequent scrapes #573

Closed fredmaggiowski closed 1 year ago

fredmaggiowski commented 1 year ago

Hi, at my company we use Algolia for our documentation site search, and we use the scraper to build the search indexes.

In our setup, we run the scraper only after deploying documentation changes, so the indexes are updated only when the content may have changed.

This works fine. However, because we have continuous delivery for the documentation site, we noticed that when multiple deployments run at the same time (and so multiple scraping processes start), the documentation search breaks and our users get a "no results" error when searching.

We are trying to understand how to prevent this, starting from these questions:

DanRoscigno commented 1 year ago

I would run the scraper on a schedule instead of on each commit. If you look at your doc changes and find that you have updates 20 times a day, then maybe run the crawl on a schedule every 8 hours.
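
If a plain cron-style schedule is not an option and the scrape still has to be triggered from the deployment pipeline, one way to get a similar effect is to skip the scrape unless enough time has passed since the last one. Below is a minimal sketch of that idea, not something the scraper provides. The names `scrape_if_due`, `should_scrape`, `read_last_run`, the state file, and the `run_scraper` callable are all hypothetical; `run_scraper` stands in for however you actually invoke the scraper.

```python
import json
import time
from pathlib import Path
from typing import Callable, Optional

# Assumption: 8 hours, matching the interval suggested above.
MIN_INTERVAL_SECONDS = 8 * 60 * 60


def should_scrape(last_run: Optional[float], now: float,
                  min_interval: float = MIN_INTERVAL_SECONDS) -> bool:
    """Return True if no scrape is recorded or the last one is old enough."""
    return last_run is None or (now - last_run) >= min_interval


def read_last_run(state_file: Path) -> Optional[float]:
    """Read the timestamp of the last scrape, or None if there is no usable record."""
    try:
        return float(json.loads(state_file.read_text())["last_run"])
    except (FileNotFoundError, KeyError, TypeError, ValueError):
        return None


def scrape_if_due(state_file: Path, run_scraper: Callable[[], None],
                  now: Optional[float] = None) -> bool:
    """Run the scraper only when the minimum interval has elapsed.

    Returns True if the scraper ran, False if the call was skipped.
    `run_scraper` is a hypothetical callable wrapping your real scraper invocation.
    """
    now = time.time() if now is None else now
    if not should_scrape(read_last_run(state_file), now):
        return False
    run_scraper()
    # Record the run only after it finished, so a failed scrape is retried.
    state_file.write_text(json.dumps({"last_run": now}))
    return True
```

Note that this only throttles how often a scrape starts. It is not a lock, so two deployments that start at the same instant could still both scrape; the state file would also need to live somewhere shared between pipeline runs.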

fredmaggiowski commented 1 year ago

Hey @DanRoscigno, thank you for your reply! That makes sense to me; we'll do as you suggest.

Cheers