[X] Code follows coding conventions held in this repo
[ ] Automated tests have been added
[X] Tests are passing
[ ] Docs have been updated (if applicable)
[X] Temporary settings (e.g. variables used during development and testing) have been reverted to defaults
How to test
Register an index using the new DistinctOptions parameter:
    Service.Use<IAlgoliaIndexRegister>(new DefaultAlgoliaIndexRegister()
        .Add<SiteSearchModel>(SiteSearchModel.IndexName, distinctOptions: new DistinctOptions(nameof(SiteSearchModel.DocumentName), 1)));
In the search model, override the AlgoliaSearchModel.SplitData method to generate multiple JObjects with the desired data. For example, to split a large "Content" value by paragraph:
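    // Requires: using System.Collections.Generic; using System.Linq;
    // using System.Text.RegularExpressions; using Newtonsoft.Json.Linq;
    public override IEnumerable<JObject> SplitData(JObject originalData)
    {
        var originalId = originalData.Value<string>("objectID");
        var content = originalData.Value<string>(nameof(Content));

        // Nothing to split: index the original record unchanged.
        // (Regex.Matches would throw on a null input.)
        if (string.IsNullOrEmpty(content))
        {
            return new[] { originalData };
        }

        // Collect each <p>...</p> block. Singleline lets a paragraph span line breaks.
        // m.Value includes the surrounding <p> tags.
        var paragraphs = Regex.Matches(content, @"<p>\s*(.+?)\s*</p>", RegexOptions.Singleline)
            .Cast<Match>()
            .Select(m => m.Value)
            .ToList();

        // One record per paragraph, each with a unique objectID derived from the original.
        return paragraphs.Select((p, index) =>
        {
            var data = (JObject)originalData.DeepClone();
            data["objectID"] = $"{originalId}-{index}";
            data[nameof(Content)] = p;
            return data;
        });
    }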
On page creation or update, multiple records should be created in Algolia, each containing a single paragraph. On page deletion, all "chunks" for that page should be deleted from Algolia. When searching on the front end, only the most relevant "chunk" per page should be returned (when using a DistinctLevel of 1).
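For reference while testing, the fragment below sketches the Algolia index settings that the registration above would be expected to produce. It assumes, without confirmation from this PR, that DistinctOptions maps its first argument to Algolia's `attributeForDistinct` setting and its second to `distinct`; check the index's configuration in the Algolia dashboard to see the actual values.

```json
{
  "attributeForDistinct": "DocumentName",
  "distinct": 1
}
```

With these settings, Algolia groups records that share the same DocumentName and returns only the top-ranked record from each group, which is the behavior described for the front-end search above.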
Motivation
Implements #19