Epinova / Epinova.Elasticsearch

A search-plugin for Episerver CMS and Commerce
MIT License
29 stars 20 forks source link

Is it possible to Filter on Category and a list of Category.ID with OR? #178

Closed jonascarlbaum closed 1 year ago

jonascarlbaum commented 2 years ago

Is there a strategy for filtering on any Category.ID in PageData.Category using some or-strategy?

Can't seems to get anything to work:

var query = _searchService.Get<BasePageData>().FilterByACL();

query = query.Filter("Category", new int[] {39, 40}, true, Operator.Or);
query = query.Filter("Category", new int[] {39, 40}, false, Operator.Or);
query = query.Filters("Category", new int[] {39, 40}, Operator.Or);
query = query.FilterGroup(group => group.Or(page => page.GetCategories() /*some extension*/, new int[] {39, 40}));

Is this a missing feature, or am I approaching this the wrong way? It's pretty simple getting a match if left or right side is array and the other is one value, but if both sides are collections, there seems to be no way to make this work?

If there is a strategy, it would really fit into the documentation, because this seems to be a tricky one...

The expected results for me is when there is at least one item in union, so if page has at least 39 and/or 40 in its category list, it's considered a match.

jonascarlbaum commented 2 years ago

Maybe a new method: FiltersAny<IEnumerable<TType>>(Expression<Action<T>> fieldSelector, IEnumerable<TType> filterValues, bool raw = true)

...or something?

otanum commented 1 year ago

I would index the CategoryList property as an array of integers.

Indexing.Instance.ForType<BasePageData>().IncludeField(x => x.CategoryArray());

Then use FilterGroup syntax above with the same CategoryArray() method.

jonascarlbaum commented 1 year ago

Thank you for your feedback @otanum!

I've tried that one, and I can't get it to work with the FilterGroup anyhow. restarted IIS and removed, recreated and reindexed the index.

I'm having this Initialization:

Epinova.ElasticSearch.Core.Conventions.Indexing.Instance
    .ExcludeType<ErrorPage>()
    .ExcludeType<StartPage>()
    .ExcludeType<SettingsPage>()
    .IncludeFileType("txt")
    .IncludeFileType("doc")
    .IncludeFileType("docx")
    .IncludeFileType("xls")
    .IncludeFileType("xlsx")
    .IncludeFileType("ppt")
    .IncludeFileType("pptx")
    .IncludeFileType("pdf")
    .ForType<ISearchSection>().IncludeProperty(x => x.SearchSection)
    .ForType<EventPage>().IncludeProperty(x => x.StartDate)
    .ForType<IContentCategorizable>().IncludeField("Category", x => x.Category.ToList())
    .ForType<BasePageData>().IncludeField(x => x.CategoryArray());

I even have this on BasePageData:

[Searchable]
public int[] Categories => Category != null && !Category.IsEmpty ? Category.ToArray() : new int[] { };

And did the extension like this:

public static class BasePageDataExtensions
{
    public static int[] CategoryArray(this BasePageData input) => input.Categories ?? new int[] { };
}

So, now I have three indexed int-arrays... ;)

The following works when one:

query = query.Filters(x => x.CategoryArray(), new int[] {39}, Operator.Or);
query = query.Filters(x => x.CategoryArray(), new int[] {40}, Operator.Or);
query = query.Filters("Category", new int[] {39}, Operator.Or);
query = query.Filters("Category", new int[] {40}, Operator.Or);
query = query.Filters(x => x.Categories, new int[] {39}, Operator.Or);
query = query.Filters(x => x.Categories, new int[] {40}, Operator.Or);

This doesn't:

query = query.Filters(x => x.CategoryArray(), new int[] {39, 40}, Operator.Or);
query = query.Filters("Category", new int[] {39, 40}, Operator.Or);
query = query.Filters(x => x.Categories, new int[] {39, 40}, Operator.Or);
query = query.FilterGroup(group => group.Or(page => page.GetCategories(), new int[] {39, 40}));
query = query.FilterGroup(group => group.Or(page => page.GetCategories(), new int[] {39}));
query = query.FilterGroup(group => group.Or(page => page.GetCategories(), new int[] {40}));
query = query.FilterGroup(group => group.Or(page => page.Categories, new int[] {39, 40}));
query = query.FilterGroup(group => group.Or(page => page.Categories, new int[] {39}));
query = query.FilterGroup(group => group.Or(page => page.Categories, new int[] {40}));

I mean, all I want is to return all pages where any category on page matches any category id. So in this case I now there are 39 and 40 on the pages, two with 39, and one with 40, so if the end user clicks the two checkboxes to see both 39 and 40 I would like this code to return all tree pages.

I can't see where I think wrong, or if there is a bug?

otanum commented 1 year ago

Hi again @jonascarlbaum. I have reproduced the int[] error. Unfortunately I have no time for fixing elastic bugs at the moment.

Quick fix: Let CategoryArray return string[].

jonascarlbaum commented 1 year ago

Ah, great, that is a quick fix I can live with.

Thank you @otanum — I’ll try it tomorrow.

It’s good to know you could reproduce it, then it’s confirmed, and no need for me to keep bang my head against the wall…

otanum commented 1 year ago

Note to self!

Code to reproduce:

.Filter(p => p.IntArray(), new[] { 9530, 4538 })
.Filter(p => p.StringArray(), new[] { "9530", "4538" })

Query with missing integers from array

"post_filter": {
    "bool": {
      "must": [
        {
          "terms": {
            "IntArray": []
          }
        },
        {
          "terms": {
            "StringArray": [
              "9530",
              "4538"
            ]
          }
        }
      ]
    }
  }
otanum commented 1 year ago

Hi again @jonascarlbaum . Could you please verify package 11.7.4.251-pre? Please verify all code versions above.

I adjusted the json converter. I have only verified .Filter with int[].

jonascarlbaum commented 1 year ago

Hi @otanum! Will check it out, but I'm a little stuck with other stuff at the moment, so I can't do it at the moment.

FYI: your workaround is working as expected, so I'll check this out when other features etc. is completed. And I have done the CMS12 beta cert... ;)

otanum commented 1 year ago

Still awaiting feedback @jonascarlbaum . Changes not merged to main branch.

jonascarlbaum commented 1 year ago

Hi @otanum!

I sadly doesn’t have time to test this one out. Too many urgent tasks on too many customers…

I feel this seems a good fix that should be merged into master!? 😉

otanum commented 1 year ago

Merged