wintercms / wn-search-plugin

Enables full-text search capabilities in Winter.
MIT License

Add support for fuzzy searching, result grouping and labelling, and relevance ordering #7

Closed bennothommo closed 4 months ago

bennothommo commented 4 months ago

This PR brings in the following features:

Fuzzy searching

Database and Collection indexes in Laravel Scout do not apply any sort of fuzzy searching, and thus tend to be very conservative in the results provided for a given query, especially if the query uses more than one word.

To allow for more results with these types of indexes, the Search component now provides a "Fuzzy search" property. When this property is ticked, the search query is pre-processed to make it fuzzier, which includes the following:

This may provide more results for any given query, with a slight penalty for relevance.

Fuzzy search should not be enabled on any index engine that provides its own fuzzy searching capabilities, such as Algolia or Meilisearch.

Result grouping and labelling

The search plugin now allows results to be grouped, and to have labels to decorate the result. Both can be used to contextualise the results.

When using the Search component, you may group results by ticking the "Enable grouping?" property. You can also limit the number of results displayed in each group. For example, this allows you to show only a certain number of pages from one section of pages, so that results from other sections appear earlier in the list.
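
As a rough illustration of grouping with a per-group limit, the sketch below uses plain PHP arrays rather than the plugin's own records. The groupResults() helper and the group key are hypothetical names chosen for this example and are not part of the plugin's API.

```php
<?php

/**
 * Hypothetical helper (not part of the plugin): groups results by a key
 * and keeps at most $perGroup results in each group.
 */
function groupResults(array $results, string $groupKey, int $perGroup): array
{
    $grouped = [];

    foreach ($results as $result) {
        // Results without a group fall back to an assumed "Other" group.
        $group = $result[$groupKey] ?? 'Other';

        // Only keep the result if the group has not reached its limit yet.
        if (count($grouped[$group] ?? []) < $perGroup) {
            $grouped[$group][] = $result;
        }
    }

    return $grouped;
}

$results = [
    ['title' => 'Installation', 'group' => 'Docs'],
    ['title' => 'Configuration', 'group' => 'Docs'],
    ['title' => 'Plugins', 'group' => 'Docs'],
    ['title' => 'Release notes', 'group' => 'Blog'],
];

// Show at most two results per group.
$grouped = groupResults($results, 'group', 2);
```

With a limit of two, the third "Docs" result is dropped, so the "Blog" group is reached sooner in the rendered list.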

Relevance ordering

By default, the Search plugin does not order results by relevancy. While this may be fine when you are using index engines like Algolia or Meilisearch, which have their own relevancy algorithms (or can be configured as such), it can lead to poorly ordered results when using the Database and Collection index engines, which have no relevancy system.

In order to support a level of relevancy in these engines, results can be post-processed after being retrieved from the index to assign a relevancy score. Relevancy in this system is determined by the order of the property names in the $searchable definition for the index.

For example, with the below definition:

public $searchable = [
    'title',
    'description',
    'keywords',
];

The title field will be the most relevant field, followed by the description and then keywords. A match in the title is given greater weight than a match in the description, which is given more weight than a match in the keywords.

The relevancy scoring also takes into account the order of the words used in the query. For example, if the query "install winter cms" is used, the word "install" is given more weight than "winter", which in turn is given more weight than "cms".
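
To make the two weighting rules more concrete, here is a minimal sketch of how such a score could be computed. It is not the plugin's actual implementation: the scoreRecord() function and the linear weights are assumptions made purely for illustration.

```php
<?php

/**
 * Hypothetical scoring sketch: earlier fields and earlier query words
 * weigh more. The weights are illustrative, not the plugin's real values.
 */
function scoreRecord(array $record, array $fields, array $words): float
{
    $score = 0.0;
    $fieldCount = count($fields);
    $wordCount = count($words);

    foreach (array_values($fields) as $fieldIndex => $field) {
        // The first field gets the highest weight, the last field gets 1.
        $fieldWeight = $fieldCount - $fieldIndex;
        $value = (string) ($record[$field] ?? '');

        foreach (array_values($words) as $wordIndex => $word) {
            // The first query word gets the highest weight, the last gets 1.
            $wordWeight = $wordCount - $wordIndex;

            // Case-insensitive substring match of the word in the field.
            if ($word !== '' && stripos($value, $word) !== false) {
                $score += $fieldWeight * $wordWeight;
            }
        }
    }

    return $score;
}

$fields = ['title', 'description', 'keywords'];
$words = ['install', 'winter', 'cms'];

$titleMatch = ['title' => 'Install guide', 'description' => '', 'keywords' => ''];
$keywordMatch = ['title' => 'Guide', 'description' => '', 'keywords' => 'install'];

$titleScore = scoreRecord($titleMatch, $fields, $words);
$keywordScore = scoreRecord($keywordMatch, $fields, $words);
```

Under these assumed weights, a match on "install" in the title scores higher than the same match in the keywords, which mirrors the ordering described above.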

To enable result relevancy, you may call the getWithRelevance() or firstRelevant() methods following any search:

$results = \Acme\Blog\BlogSearch::doSearch('install winter cms')->getWithRelevance();

getWithRelevance() will retrieve all records, ordered by relevancy score, whilst firstRelevant() will retrieve only the most relevant record.

If you wish to customise the relevancy scoring, you may also provide a callable to both methods. The callable must accept two arguments, a model instance and an array of words from the query, and must return a score as a float or an int. Results are ordered by this score in descending order (higher score = more relevant). Each record found in the index will be run through this callable.

$results = \Acme\Blog\BlogSearch::doSearch('install winter cms')->getWithRelevance(function ($model, array $words) {
    // Score each record and return the score as an integer or float.
});
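
As an example of what such a callable might look like, the sketch below scores a record by counting how many query words appear in its title. It assumes the model exposes a title attribute, as in the $searchable definition above. The plain objects and the usort() call only stand in for the plugin so that the snippet runs on its own.

```php
<?php

// Hypothetical scorer: counts how many query words appear in the title.
// Assumes the model has a `title` attribute.
$scorer = function ($model, array $words): int {
    $score = 0;

    foreach ($words as $word) {
        if (stripos((string) $model->title, $word) !== false) {
            $score++;
        }
    }

    return $score;
};

// With the plugin, the closure would be passed to getWithRelevance():
// $results = \Acme\Blog\BlogSearch::doSearch('install winter cms')->getWithRelevance($scorer);

// Stand-alone check, with plain objects standing in for model instances.
$records = [
    (object) ['title' => 'About us'],
    (object) ['title' => 'Install Winter CMS'],
    (object) ['title' => 'Winter themes'],
];
$words = ['install', 'winter', 'cms'];

// Order by score, highest first, to mimic the descending relevancy order.
usort($records, fn ($a, $b) => $scorer($b, $words) <=> $scorer($a, $words));
```

Returning a simple count keeps the scorer easy to reason about. A real scorer could weight fields or word positions differently.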