webcomponents / webcomponents.org

Home of the web components community
https://www.webcomponents.org
Apache License 2.0
356 stars 95 forks

[catalog-server] Full text search across elements #1363

Open justinfagnani opened 1 year ago

justinfagnani commented 1 year ago

We need a way to do full-text search across all elements.

GCP doesn't have a full-text search service, for some reason. We could use a service like Algolia or Elasticsearch, or implement a minimal full-text search ourselves.

For a search service we need to figure out how to keep the search index live as the database changes, how to integrate with structured search, and how to mix full-text result relevance with other rankings such as those discussed in https://github.com/webcomponents/webcomponents.org/issues/1361. We could try to update the index as we import new packages, or run a scheduled job that updates the index X times per day and live with some lag.

For a custom search implementation we need to do some of the standard full-text search steps ourselves, but keeping the search index in sync with data updates is a bit easier, and can even be done within transactions. The steps include:

  1. Tokenizing and stemming the search-contributing text fields and storing the resulting terms either in the document to be searched or in a separate search table.
  2. Tokenizing and stemming the user query.
  3. Performing an array-contains-any query with the user terms against the search field.
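
As a rough illustration of those three steps, here is a minimal in-memory sketch in Python. Everything in it is an assumption made for illustration: the `search_terms` field name, the stopword list, the crude suffix-stripping `stem` (a real implementation would likely use a proper stemmer), and the `array_contains_any` helper, which only imitates the database query on a plain list of dicts. Ranking by the number of matching terms is also just one possible choice, not something decided in this issue.

```python
import re

# Hypothetical stopword list and suffix rules, chosen only for this sketch.
STOPWORDS = {"a", "an", "and", "the", "of", "for", "to", "in"}
SUFFIXES = ("ing", "ed", "s")


def stem(token: str) -> str:
    """Crude suffix stripping; a stand-in for a real stemming algorithm."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def tokenize(text: str) -> list[str]:
    """Lowercase, split on non-alphanumerics, drop stopwords, stem, dedupe."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return sorted({stem(w) for w in words if w not in STOPWORDS})


def index_document(doc: dict, fields: tuple[str, ...]) -> dict:
    """Step 1: store the stemmed terms on the document itself."""
    text = " ".join(doc.get(f, "") for f in fields)
    return {**doc, "search_terms": tokenize(text)}


def array_contains_any(docs: list[dict], field: str, values: list[str]) -> list[dict]:
    """In-memory imitation of an array-contains-any query (step 3)."""
    wanted = set(values)
    return [d for d in docs if wanted & set(d.get(field, []))]


def search(docs: list[dict], query: str) -> list[dict]:
    """Steps 2 and 3: tokenize the query, then match it against search_terms."""
    terms = tokenize(query)
    if not terms:
        return []
    matches = array_contains_any(docs, "search_terms", terms)
    # Assumed ranking: documents matching more query terms come first.
    return sorted(matches, key=lambda d: -len(set(terms) & set(d["search_terms"])))


# Hypothetical sample data, not taken from the real catalog.
raw_docs = [
    {"name": "paper-button", "description": "A material design button"},
    {"name": "date-picker", "description": "Picking dates from a calendar"},
]
indexed_docs = [index_document(d, ("name", "description")) for d in raw_docs]
results = search(indexed_docs, "picked a date")
```

Because the terms live on the document, `index_document` could run in the same transaction that writes the package data, which is the sync advantage mentioned above. The trade-off is that an any-match query only returns candidates; relevance ordering still has to be computed separately.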