Kentico / xperience-algolia

Enables the creation of Algolia search indexes and the indexing of Xperience content tree pages using a code-first approach.
https://www.kentico.com
MIT License

Feat/distinct index #26

Closed kentico-ericd closed 1 year ago

kentico-ericd commented 2 years ago

Motivation

Implements #19

Checklist

How to test

Register an index using the new DistinctOptions parameter:

Service.Use<IAlgoliaIndexRegister>(new DefaultAlgoliaIndexRegister()
                .Add<SiteSearchModel>(SiteSearchModel.IndexName, distinctOptions: new DistinctOptions(nameof(SiteSearchModel.DocumentName), 1)));

In the search model, implement the AlgoliaSearchModel.SplitData method to generate multiple JObjects with the desired data. For example, split a large "Content" value by paragraph:

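public override IEnumerable<JObject> SplitData(JObject originalData)
{
    var originalId = originalData.Value<string>("objectID");
    var content = originalData.Value<string>(nameof(Content));
    if (string.IsNullOrEmpty(content))
    {
        // Nothing to split (Regex.Match would throw on null); keep the record as-is.
        return new[] { originalData };
    }

    // Collect every <p>...</p> element, tags included.
    // Singleline lets "." match line breaks, so multi-line paragraphs are not skipped.
    var paragraphs = new List<string>();
    var m = System.Text.RegularExpressions.Regex.Match(
        content,
        @"<p>\s*(.+?)\s*</p>",
        System.Text.RegularExpressions.RegexOptions.Singleline);
    while (m.Success)
    {
        paragraphs.Add(m.Value);
        m = m.NextMatch();
    }

    // Emit one record per paragraph, each with a unique objectID derived from the original.
    return paragraphs.Select((p, index) => {
        var data = (JObject)originalData.DeepClone();
        data["objectID"] = $"{originalId}-{index}";
        data[nameof(Content)] = p;
        return data;
    });
}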
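To check what the paragraph regex produces without an Xperience site or Algolia account, the splitting step can be run on plain strings. This is a minimal sketch: ParagraphSplitter is a hypothetical helper, not part of this PR, and it uses RegexOptions.Singleline so that paragraphs containing line breaks still match. Note that the pattern only matches a bare <p> tag, so paragraphs with attributes are skipped.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; not part of the library.
public static class ParagraphSplitter
{
    // Returns each <p>...</p> element found in the HTML, tags included.
    public static List<string> Split(string html)
    {
        var paragraphs = new List<string>();
        if (string.IsNullOrEmpty(html))
        {
            return paragraphs;
        }

        // Singleline lets "." match newlines, so multi-line paragraphs are captured.
        foreach (Match m in Regex.Matches(html, @"<p>\s*(.+?)\s*</p>", RegexOptions.Singleline))
        {
            paragraphs.Add(m.Value);
        }

        return paragraphs;
    }
}
```

Each returned string would become the "Content" value of one chunk, with the list position used as the objectID suffix.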

On page creation or update, multiple records should be created in Algolia, each containing a single paragraph. On page deletion, all "chunks" for that page should be deleted from Algolia. When searching on the front end with a DistinctLevel of 1, only the most relevant "chunk" per page should be returned.
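The expected front-end behavior can be illustrated with an in-memory approximation of de-duplication. This is only a sketch under loud assumptions: Chunk and DistinctDemo are hypothetical names, and the single numeric Score stands in for Algolia's real ranking, which is computed server-side and is not a single score. It shows the grouping idea only: keep the top `level` chunks per DocumentName.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for an Algolia record; not a type from the library.
public sealed class Chunk
{
    public string ObjectId { get; set; } = string.Empty;
    public string DocumentName { get; set; } = string.Empty;
    // Assumption: one number approximates relevance for this illustration.
    public double Score { get; set; }
}

public static class DistinctDemo
{
    // Keeps the `level` highest-scoring chunks per DocumentName,
    // then orders the survivors by score, best first.
    public static List<Chunk> ApplyDistinct(IEnumerable<Chunk> hits, int level = 1)
    {
        return hits
            .GroupBy(h => h.DocumentName)
            .SelectMany(g => g.OrderByDescending(h => h.Score).Take(level))
            .OrderByDescending(h => h.Score)
            .ToList();
    }
}
```

With two chunks for an "About" page and one for "Home", a level of 1 would leave one hit per page, which is what the test above should show in the search results.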

kentico-ericd commented 1 year ago

Closing: this will be implemented by https://github.com/Kentico/xperience-algolia/pull/29