elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch
Other
837 stars 24.8k forks source link

Getting the nested objects in sorted order. #17067

Closed himaniJoshi closed 8 years ago

himaniJoshi commented 8 years ago

Hello, I first want to declare that I have been introduced to the world of elasticsearch just a week back, so there could be some obvious points I might be missing while addressing my problems.

However, there is one particular problem, I just can't seem to find a solution to, after exhaustively trying to find and answer, so I would like some help here.

I have created a dummy index to explain my problem.

The mapping is :

{"mappings": {
"person" : {
  "_source": {
    "enabled": false
  }, 
  "properties": {
    "name" :  {
      "type" : "string"
    },
    "age" :{
      "type"  :"long"
    },
    "city" : {
      "type" : "string"
    },
    "attributes" : {
      "type": "nested",
      "include_in_all": true,
      "properties": {
        "attributeName": {
          "type" : "string"
        },
        "id"  :{
          "type" : "string",
          "store" :  true
        },
        "rating" : {
          "type" : "long"
        }
      }
    }
  }
}}}

'attributes' is nested, and we are storing the 'attributes.id'

I populated this with following entries:

{"name" : "abc1",
"age" : 21,
"city" : "Delhi",
"attributes" : [
  {
    "id" : 1,
    "rating"  : 5,
    "attributeName"  : "at1"
  },
  {"id" : 2,
    "rating" : 7,
    "attributeName"  : "at2"
  }
]}

 {"name" : "abc2",
 "age" : 24,
 "city" : "Mumbai",
 "attributes" : [
  {
    "id" : 4,
    "rating"  : 4,
    "attributeName"  : "at4"
  },
  {"id" : 3,
    "rating" : 8,
    "attributeName"  : "at3"
  }
]}

 {"name" : "abc3",
 "age" : 26,
 "city" : "Delhi",
 "attributes" : [
  {
    "id" : 5,
    "rating"  : 9,
    "attributeName"  : "at5"
  },
  {"id" : 6,
    "rating" : 6,
    "attributeName"  : "at6"
  }
]}

Now my problem required me to make a query that applies some filter on city, and some filter on attributes.rating, and gives me back just the attribute ids. There doesn't seem to be an exact solution for this, but I have been able to get it working by using inner_hits and adding id to the inner_hits' fields. Like this:

{"query" :{
          "bool": {
            "must": [
               {
                "match" : {
                  "city" : "Delhi"
                }
              },
              {
                "nested": {
                  "path": "attributes",
                  "query": {
                    "bool": {
                      "must": [
                       {
                          "range": {
                            "attributes.rating": {
                              "gt": 5
                            }
                          }
                        }
                      ]
                    }
                  },
                  "inner_hits" : {
                      "fields"  : ["attributes.id"]
                  }
                }
              }
            ]
          }
        }}

This gives me the following response:

{"took": 9,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 1.724915,
"hits": [
  {
    "_index": "test",
    "_type": "person",
    "_id": "1",
    "_score": 1.724915,
    "inner_hits": {
      "attributes": {
        "hits": {
          "total": 1,
          "max_score": 1,
          "hits": [
            {
              "_index": "test",
              "_type": "person",
              "_id": "1",
              "_nested": {
                "field": "attributes",
                "offset": 1
              },
              "_score": 1,
              "fields": {
                "attributes.id": [
                  "2"
                ]
              }
            }
          ]
        }
      }
    }
  },
  {
    "_index": "test",
    "_type": "person",
    "_id": "3",
    "_score": 1.724915,
    "inner_hits": {
      "attributes": {
        "hits": {
          "total": 2,
          "max_score": 1,
          "hits": [
            {
              "_index": "test",
              "_type": "person",
              "_id": "3",
              "_nested": {
                "field": "attributes",
                "offset": 1
              },
              "_score": 1,
              "fields": {
                "attributes.id": [
                  "6"
                ]
              }
            },
            {
              "_index": "test",
              "_type": "person",
              "_id": "3",
              "_nested": {
                "field": "attributes",
                "offset": 0
              },
              "_score": 1,
              "fields": {
                "attributes.id": [
                  "5"
                ]
              }
            }
          ]
        }
      }
    }
  }
]}}

The second part of my problem requires me to return the attribute id in sorted order. So from above query how can I get attribute.id, sorted by rating in ascending order. This means that i should get Ids in sequence 6, 2, 5. However, since in my response I get the parent, not the nested doc, and since 5 and 6 are in one parent, and 2 in another, does my problem even have a solution?

jasontedor commented 8 years ago

GitHub is reserved for feature requests and bug reports; general questions are best addressed on the Elastic Discourse forums.