Closed McGern closed 4 months ago
Hey @McGern, as far as I understand the codebase, we pass the filter string straight on to Examine. Since Lucene, which sits under Examine, treats spaces as term separators by default, this seems to be expected behavior.
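To make that concrete, here is a toy model of term-based matching, run against the three authors from the report (Gary Smith, John Smith, Joan Thomson). It is only a sketch and not Examine or Lucene code: the class `TokenizedMatchDemo` and its methods are hypothetical, and it assumes the analyzer lowercases values, splits them on whitespace, and treats a document as a match when it shares at least one term with the query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy model only: this is NOT how Examine/Lucene is implemented internally.
public static class TokenizedMatchDemo
{
    // Assumption: the analyzer lowercases the value and splits it on spaces.
    public static HashSet<string> Tokenize(string value) =>
        value.ToLowerInvariant()
             .Split(' ', StringSplitOptions.RemoveEmptyEntries)
             .ToHashSet();

    // Assumption: a document matches when it shares at least one term with the query.
    public static List<string> Search(IEnumerable<string> authors, string query)
    {
        HashSet<string> queryTerms = Tokenize(query);
        return authors.Where(author => Tokenize(author).Overlaps(queryTerms)).ToList();
    }
}
```

Under these assumptions, `Search(authors, "Smith")` returns both Smiths, `Search(authors, "Gary")` returns only Gary Smith, and `Search(authors, "Gary smith")` again returns both Smiths, because the query is split into the terms `gary` and `smith`. That lines up with the 2, 1 and 2 results reported in the issue.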
A quick workaround I learned from having to build search on Examine/Lucene is as follows. Manipulate the term generation to also include an "exact" term by concatenating all terms in a property. Save this alongside the original value (in a separate field or in the same field). This then allows you to search on partial terms and on full terms. I updated the code example with this.
/umbraco/delivery/api/v2/content?filter=author:smith
/umbraco/delivery/api/v2/content?filter=author:gary%20smith
/umbraco/delivery/api/v2/content?filter=author:gary_smith
using Umbraco.Cms.Core.DeliveryApi;
using Umbraco.Cms.Core.Models;

namespace umb15920;

public class AuthorFilter : IFilterHandler, IContentIndexHandler
{
    private const string AuthorSpecifier = "author:";
    private const string FieldName = "author";

    // Querying
    public bool CanHandle(string query)
        => query.StartsWith(AuthorSpecifier, StringComparison.OrdinalIgnoreCase);

    public FilterOption BuildFilterOption(string filter)
    {
        var fieldValue = filter.Substring(AuthorSpecifier.Length);

        // There might be several values for the filter
        var values = fieldValue.Split(',').ToArray();

        return new FilterOption
        {
            FieldName = FieldName,
            Values = values,
            Operator = FilterOperation.Is
        };
    }

    // Indexing
    public IEnumerable<IndexFieldValue> GetFieldValues(IContent content, string? culture)
    {
        string? author = content.GetValue<string>("author");

        if (string.IsNullOrWhiteSpace(author))
        {
            return Array.Empty<IndexFieldValue>();
        }

        return new[]
        {
            new IndexFieldValue
            {
                FieldName = FieldName,
                Values = new object[] { author, author.EnsureSingleTerm() }
            }
        };
    }

    public IEnumerable<IndexField> GetFields() => new[]
    {
        new IndexField
        {
            FieldName = FieldName,
            FieldType = FieldType.StringSortable,
            VariesByCulture = false
        }
    };
}

public static class ExamineStringExtensions
{
    public static string EnsureSingleTerm(this string input)
        => input.Replace(' ', '_');
}
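With the handler above, the caller has to send `gary_smith` to hit the exact term. One possible refinement, sketched here and not part of the original example, is to apply the same underscore replacement to the incoming filter values so that `author:gary smith` looks up the single concatenated term. `AuthorFilterValues` and `Parse` are hypothetical names, and the sketch assumes the filter string has already been URL-decoded by the time it reaches the handler.

```csharp
using System;
using System.Linq;

// Hypothetical helper; not part of Umbraco or of the handler shown above.
public static class AuthorFilterValues
{
    private const string AuthorSpecifier = "author:";

    // Same replacement as EnsureSingleTerm in the handler above.
    public static string EnsureSingleTerm(string input) => input.Replace(' ', '_');

    // Strips the "author:" prefix, splits on commas, and turns each value
    // into a single term so it lines up with the extra indexed value.
    public static string[] Parse(string filter)
    {
        if (!filter.StartsWith(AuthorSpecifier, StringComparison.OrdinalIgnoreCase))
        {
            throw new ArgumentException("Filter must start with 'author:'.", nameof(filter));
        }

        return filter.Substring(AuthorSpecifier.Length)
            .Split(',', StringSplitOptions.RemoveEmptyEntries)
            .Select(value => EnsureSingleTerm(value.Trim()))
            .ToArray();
    }
}
```

In `BuildFilterOption` this would replace the plain `Split(',')` call, for example `Values = AuthorFilterValues.Parse(filter)`. A single-word value such as `author:smith` is left unchanged, so partial-term queries would still behave as before; only multi-word values are collapsed into one term.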
@Migaroez Thanks for your detailed feedback, it's very much appreciated.
Which Umbraco version are you using? (Please write the exact version, example: 10.1.0)
13.2.2
Bug summary
In the Content Delivery API, when indexing fields using the FieldType.StringSortable value for the IndexField, values that contain spaces seem to be indexed as separate terms.
Specifics
When using StringRaw, a query needs the exact value, including casing. When using StringSortable, casing is ignored (desired), but words separated by spaces are indexed separately. Note that FilterOperation.Is has been applied in BuildFilterOption.
I'm not sure if this is a bug or the expected behaviour, but I couldn't find any information in the docs about what the FieldType values are expected to do.
Steps to reproduce
Create a fresh install of Umbraco and use the sample filter from https://docs.umbraco.com/umbraco-cms/reference/content-delivery-api/extension-api-for-querying#custom-filter, stripping out the author GUID lookup and replacing it with a simple string.
Set up the Content Delivery API as documented.
Note the FieldType.StringSortable in the GetFields method and FilterOperation.Is in the BuildFilterOption method.
Create a document type "Article" with only "author" as a textstring property.
Add a few articles with varying author names: Article 1 - Gary Smith, Article 2 - John Smith, Article 3 - Joan Thomson.
Query with e.g. https://localhost:44338/umbraco/delivery/api/v2/content?filter=author:Smith
Expected result / actual result
Querying with /umbraco/delivery/api/v2/content?filter=author:Smith
Resulted in 2 items (A1 - Gary Smith and A2 - John Smith), would have expected 0
/umbraco/delivery/api/v2/content?filter=author:Gary
Resulted in 1 item (A1 - Gary Smith), would have expected 0
/umbraco/delivery/api/v2/content?filter=author:Gary%20smith
Resulted in 2 items (A1 - Gary Smith and A2 - John Smith), would have expected 1