elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

First SortMode for Nested Fields #33592

Closed erayarslan closed 6 years ago

erayarslan commented 6 years ago

Describe the feature: First SortMode for Nested Fields

I want to sort documents by an attribute of a selected inner hit. For example, I have a mapping and two documents (red and blue) like these:

PUT test_index
{
  "mappings": {
    "mydoc": {
      "properties": {
        "name": {
          "type": "keyword"
        },
        "myNestedList": {
          "type": "nested",
          "properties": {
            "name": {
              "type": "keyword"
            },
            "price": {
              "type": "integer"
            },
            "sortId": {
              "type": "integer"
            }
          }
        }
      }
    }
  }
}
PUT test_index/mydoc/1
{
  "name": "red",
  "myNestedList": [
    {
      "name": "red_1",
      "price": 10,
      "sortId": 1
    },
    {
      "name": "red_2",
      "price": 25,
      "sortId": 2
    }
  ]
}
PUT test_index/mydoc/2
{
  "name": "blue",
  "myNestedList": [
    {
      "name": "blue_1",
      "price": 15,
      "sortId": 2
    },
    {
      "name": "blue_2",
      "price": 5,
      "sortId": 3
    }
  ]
}

Normally, when I sort by price descending, red comes first, because red_2 has a price of 25 while the highest price in blue is 15 (blue_1). I did this and it works fine.

I also tried sorting with scripts, but it fails when I use a nested filter.

What I want is this: when sorting documents by price descending, each document should be sorted by the price of its nested document with the lowest sortId. In this case I should get blue first and then red, because blue_1's price (15) is greater than red_1's price (10).
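
The difference between the two orderings can be sketched in a few lines of Python. This is only an illustration that assumes the two documents are held in memory as plain dicts; the helper names `max_price` and `price_of_lowest_sort_id` are made up here and are not Elasticsearch APIs.

```python
# In-memory copies of the two documents from the example above.
docs = [
    {"name": "red", "myNestedList": [
        {"name": "red_1", "price": 10, "sortId": 1},
        {"name": "red_2", "price": 25, "sortId": 2},
    ]},
    {"name": "blue", "myNestedList": [
        {"name": "blue_1", "price": 15, "sortId": 2},
        {"name": "blue_2", "price": 5, "sortId": 3},
    ]},
]


def max_price(doc):
    """Key a descending sort uses by default: the highest nested price."""
    return max(child["price"] for child in doc["myNestedList"])


def price_of_lowest_sort_id(doc):
    """Requested key: the price of the nested document with the lowest sortId."""
    return min(doc["myNestedList"], key=lambda child: child["sortId"])["price"]


current = [d["name"] for d in sorted(docs, key=max_price, reverse=True)]
wanted = [d["name"] for d in sorted(docs, key=price_of_lowest_sort_id, reverse=True)]

print(current)  # ['red', 'blue']
print(wanted)   # ['blue', 'red']
```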

elasticmachine commented 6 years ago

Pinging @elastic/es-search-aggs

yagizdemirsoy commented 6 years ago

+1

jpountz commented 6 years ago

Leaving it open for now since it was marked team-discuss but I don't think we should support this feature. Instead, when the relevant sort value can be known at index time like here, it should be added to the parent document.

jimczi commented 6 years ago

@jpountz the issue with the index-time solution is that it doesn't work if you have a nested filter in the query. In that case the relevant sort value cannot be inferred at indexing time. The idea with this first mode is that users can order their nested documents so that the most relevant one comes first, and then use this mode to select the value from the first matching child. Since we preserve the order of nested documents when we index them, this solution should work fine even if nested filters are present.
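
A rough Python model of this idea, using the red and blue documents from the issue description. It assumes that the list order stands in for the preserved index order of nested documents and that a nested filter is just a predicate on a child; `first_matching_price` and `rank_desc` are hypothetical names, not existing Elasticsearch code.

```python
# Nested children per parent, listed in the order they were indexed.
docs = {
    "red": [
        {"name": "red_1", "price": 10, "sortId": 1},
        {"name": "red_2", "price": 25, "sortId": 2},
    ],
    "blue": [
        {"name": "blue_1", "price": 15, "sortId": 2},
        {"name": "blue_2", "price": 5, "sortId": 3},
    ],
}


def first_matching_price(children, nested_filter):
    """Sketch of a `first` mode: price of the first child, in index order, that passes the filter."""
    for child in children:
        if nested_filter(child):
            return child["price"]
    return None  # no child matched the filter


def rank_desc(nested_filter):
    """Order parent names by the selected price, highest first; parents without a match are dropped."""
    keys = {name: first_matching_price(children, nested_filter)
            for name, children in docs.items()}
    matched = {name: price for name, price in keys.items() if price is not None}
    return sorted(matched, key=lambda name: matched[name], reverse=True)


no_filter = rank_desc(lambda child: True)                    # red -> 10, blue -> 15
with_filter = rank_desc(lambda child: child["sortId"] >= 2)  # red -> 25, blue -> 15
```

Here `no_filter` ranks blue ahead of red while `with_filter` ranks red ahead of blue, so no single value stored on the parent at index time could serve both queries.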

jpountz commented 6 years ago

@jimczi Thanks for clarifying, this proposal makes sense to me.

erayarslan commented 6 years ago

Thank you @jimczi for clarification.

Hello @jpountz,

Let me try to explain our problem with more details:

This is a business-critical problem for us, and it can also be a problem for other e-commerce companies that show multiple listings on their site. Currently, when we sort ascending or descending, the displayed price of the product changes, because Elasticsearch sorts by the lowest or highest priced nested document. What we want is to always show the same listing's price.

{
  "name": "red",
  "myNestedList": [
    {
      "name": "red_1",
      "price": 10,
      "sortId": 1
    },
    {
      "name": "red_2",
      "price": 25,
      "sortId": 2
    },
    {
      "name": "red_3",
      "price": 30,
      "sortId": 3
    },
    {
      "name": "red_4",
      "price": 40,
      "sortId": 4
    },
    {
      "name": "red_5",
      "price": 5,
      "sortId": 5
    }
  ]
}

Consider the document above:

If we add the sort value to the parent document at index time, this document will always be sorted by red_1's price. However, if we apply a nested range filter on the price field between 27 and 42, sorting should use the price of red_3 (which has the lowest sortId after the filter is applied). So the price that we select for sorting changes dynamically according to the filters applied. We won't have this flexibility if we put the value on the parent document at index time.
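
This example can be worked through in Python as a sanity check. The sketch assumes the nested documents are already in sortId order and models the range filter as inclusive bounds; `first_price` is a hypothetical helper, not an Elasticsearch function.

```python
# The "red" document from above; children are listed in sortId order.
red = {
    "name": "red",
    "myNestedList": [
        {"name": "red_1", "price": 10, "sortId": 1},
        {"name": "red_2", "price": 25, "sortId": 2},
        {"name": "red_3", "price": 30, "sortId": 3},
        {"name": "red_4", "price": 40, "sortId": 4},
        {"name": "red_5", "price": 5, "sortId": 5},
    ],
}


def first_price(children, low=None, high=None):
    """Return (name, price) of the first child whose price lies in [low, high], or None."""
    for child in children:
        above = low is None or child["price"] >= low
        below = high is None or child["price"] <= high
        if above and below:
            return child["name"], child["price"]
    return None


# Value that would be copied to the parent at index time: always red_1's price.
index_time_value = red["myNestedList"][0]["price"]

unfiltered = first_price(red["myNestedList"])       # ('red_1', 10)
filtered = first_price(red["myNestedList"], 27, 42) # ('red_3', 30)
```

With the 27 to 42 range applied, the selected price is 30, while the value fixed at index time would still be 10.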

Currently there is no way to do this kind of sorting. If we had a sort mode first, we could sort parent documents by the price of the first matching nested document with a query like this (nested documents will already be ordered by sortId at index time):

{
  "sort": [
    {
      "myNestedList.price": {
        "mode": "first",
        "order": "asc",
        "nested": {
          "path": "myNestedList"
        }
      }
    }
  ]
}

volkantagal commented 6 years ago

+1

jimczi commented 6 years ago

We discussed this feature request in our internal meeting. We agreed that we want to support this new mode, but only for nested sort. The order of nested documents is preserved in the index, so a first mode makes sense there. For numeric fields, however, the order is not preserved: multi-values are sorted, so the first value is always the smallest. For this reason, we need to find a way to restrict this mode to nested sort only.
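
A small Python sketch of why the two cases differ. It assumes, purely for illustration, that a multi-valued numeric field can be modeled as the sorted list of its values and that nested documents can be modeled as a list kept in insertion order; `as_stored_multi_value` is a made-up helper, not Elasticsearch code.

```python
def as_stored_multi_value(values):
    """Model of a multi-valued numeric field: values come back sorted, not in input order."""
    return sorted(values)


# Prices in the order the user supplied them.
prices_as_supplied = [25, 10, 30]

# Multi-valued numeric field: "first" collapses to the minimum.
stored = as_stored_multi_value(prices_as_supplied)
first_of_multi_value = stored[0]

# Nested documents: insertion order is kept, so "first" is the first one supplied.
nested_children = [{"price": price} for price in prices_as_supplied]
first_of_nested = nested_children[0]["price"]

print(first_of_multi_value)  # 10
print(first_of_nested)       # 25
```

Under these assumptions a first mode on a plain multi-valued field would behave exactly like min, which is why it only carries extra meaning for nested sort.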