elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

Add 'key' field to function_score query function for explanation retrieval #46007

Open lrynek opened 5 years ago

lrynek commented 5 years ago

When I try to extract the current factor value from the _explanation part of the Elasticsearch JSON response (e.g. for debugging or logging purposes), I can do it only by text-matching the script body (and only for functions that use a script; the filter-based ones are out of reach). I propose adding a new field, key (or whatever name suits best), to each item of the function_score query's functions array, as follows:

Now

(see: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html)

Request

{
  "explain": true,
  "query": {
    "function_score": {
      "query": {
        "match_all": {}
      },
      "functions": [
        {
          "script_score": {
            "script": {
              "lang": "painless",
              "source": "return doc['ids'].containsAll(params.ids) ? 1 : 0;",
              "params": {
                "ids": [1, 2]
              }
            }
          },
          "weight": 65
        },
        {
          "filter": {
            "terms": {
              "location.city_id": [
                "1"
              ]
            }
          },
          "weight": 35
        }
      ],
      "boost_mode": "replace",
      "score_mode": "sum",
      "min_score": 0
    }
  }
}

Response

{
  "took": 35,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 100.0,
    "hits": [
      {
        "_score": 100.0,
        "_source": {
        },
        "_explanation": {
          "value": 100.0,
          "description": "sum of:",
          "details": [
            {
              "value": 100.0,
              "description": "min of:",
              "details": [
                {
                  "value": 100.0,
                  "description": "function score, score mode [sum]",
                  "details": [
                    {
                      "value": 65.0,
                      "description": "product of:",
                      "details": [
                        {
                          "value": 1.0,
                          "description": "script score function, computed with script:\"Script{type=inline, lang='painless', idOrCode='return doc['ids'].containsAll(params.ids) ? 1 : 0;', options={}, params={ids=[1,2]}\" and parameters: \n{ids=[1,2]}",
                          "details": []
                        },
                        {
                          "value": 65.0,
                          "description": "weight",
                          "details": []
                        }
                      ]
                    },
                    {
                      "value": 35.0,
                      "description": "function score, product of:",
                      "details": [
                        {
                          "value": 1.0,
                          "description": "match filter: location.city_id:{1}",
                          "details": []
                        },
                        {
                          "value": 35.0,
                          "description": "product of:",
                          "details": [
                            {
                              "value": 1.0,
                              "description": "constant score 1.0 - no function provided",
                              "details": []
                            },
                            {
                              "value": 35.0,
                              "description": "weight",
                              "details": []
                            }
                          ]
                        }
                      ]
                    }
                  ]
                }
              ]
            }
          ]
        }
      }
    ]
  }
}
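
For illustration, the text-matching workaround described above might look like the following Python sketch. The helper names (iter_nodes, find_script_function_value) are made up for this example, and it assumes the explanation layout shown in the response above, where the weighted value sits on the node that encloses the script node.

```python
from typing import Any, Dict, Iterator, Optional, Tuple

Node = Dict[str, Any]


def iter_nodes(node: Node, parent: Optional[Node] = None) -> Iterator[Tuple[Node, Optional[Node]]]:
    """Yield (node, parent) pairs for every node of an _explanation tree, depth first."""
    yield node, parent
    for child in node.get("details", []):
        yield from iter_nodes(child, node)


def find_script_function_value(explanation: Node, script_source: str) -> Optional[float]:
    """Return the weighted value of the function whose script body contains script_source.

    Assumption: the script node is a direct child of the "product of:" node
    that carries the weighted value, as in the response shown in this issue.
    """
    for node, parent in iter_nodes(explanation):
        description = node.get("description", "")
        if description.startswith("script score function") and script_source in description:
            enclosing = parent if parent is not None else node
            return enclosing["value"]
    return None
```

This only works for script-based functions, and it breaks as soon as the script text or the description wording changes, which is the fragility the proposal is trying to remove.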

After implementation

Request

{
  "explain": true,
  "query": {
    "function_score": {
      "query": {
        "match_all": {}
      },
      "functions": [
        {
here----->"key": "af59aa50-19f4-45c8-90d2-c1a0b91416e1",
          "script_score": {
            "script": {
              "lang": "painless",
              "source": "return doc['ids'].containsAll(params.ids) ? 1 : 0;",
              "params": {
                "ids": [1, 2]
              }
            }
          },
          "weight": 65
        },
        {
here----->"key": "f4ff6d9e-96d6-401c-8da7-ff99d8228457",
          "filter": {
            "terms": {
              "location.city_id": [
                "1"
              ]
            }
          },
          "weight": 35
        }
      ],
      "boost_mode": "replace",
      "score_mode": "sum",
      "min_score": 0
    }
  }
}

Response

{
  "took": 35,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 100.0,
    "hits": [
      {
        "_score": 100.0,
        "_source": {
        },
        "_explanation": {
          "value": 100.0,
here----->"key_value_pairs": {
(on-the-root-level)
            "af59aa50-19f4-45c8-90d2-c1a0b91416e1": 65.0,
            "f4ff6d9e-96d6-401c-8da7-ff99d8228457": 35.0
          },
          "description": "sum of:",
          "details": [
            {
              "value": 100.0,
              "description": "min of:",
              "details": [
                {
                  "value": 100.0,
                  "description": "function score, score mode [sum]",
                  "details": [
                    {
or-here-------------->"key": "af59aa50-19f4-45c8-90d2-c1a0b91416e1",
(on-first-computed-distinctive-value-level)
                      "value": 65.0,
                      "description": "product of:",
                      "details": [
                        {
                          "value": 1.0,
                          "description": "script score function, computed with script:\"Script{type=inline, lang='painless', idOrCode='return doc['ids'].containsAll(params.ids) ? 1 : 0;', options={}, params={ids=[1,2]}\" and parameters: \n{ids=[1,2]}",
                          "details": []
                        },
                        {
                          "value": 65.0,
                          "description": "weight",
                          "details": []
                        }
                      ]
                    },
                    {
or-here-------------->"key": "f4ff6d9e-96d6-401c-8da7-ff99d8228457",
(on-first-computed-distinctive-value-level)
                      "value": 35.0,
                      "description": "function score, product of:",
                      "details": [
                        {
                          "value": 1.0,
                          "description": "match filter: location.city_id:{1}",
                          "details": []
                        },
                        {
                          "value": 35.0,
                          "description": "product of:",
                          "details": [
                            {
                              "value": 1.0,
                              "description": "constant score 1.0 - no function provided",
                              "details": []
                            },
                            {
                              "value": 35.0,
                              "description": "weight",
                              "details": []
                            }
                          ]
                        }
                      ]
                    }
                  ]
                }
              ]
            }
          ]
        }
      }
    ]
  }
}

With this or a similar implementation, retrieving specific computed values would be much more precise.
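
As a rough sketch of how a client could consume the second variant (a key on each function's explanation node), the Python below walks the tree and collects key/value pairs. The key field is only the proposal made in this issue and is not part of the actual Elasticsearch response; collect_keyed_values is a hypothetical helper name.

```python
from typing import Any, Dict

Node = Dict[str, Any]


def collect_keyed_values(explanation: Node) -> Dict[str, float]:
    """Map every proposed "key" found in an _explanation tree to that node's value."""
    found: Dict[str, float] = {}
    stack = [explanation]
    while stack:
        node = stack.pop()
        # "key" is the hypothetical field proposed in this issue.
        if "key" in node:
            found[node["key"]] = node["value"]
        stack.extend(node.get("details", []))
    return found
```

The result has the same shape as the key_value_pairs object from the first variant, so a client would not need any text matching either way.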

elasticmachine commented 5 years ago

Pinging @elastic/es-search

mayya-sharipova commented 5 years ago

@lrynek Thank you for filing the issue. Our current plan is to deprecate the function_score query in favour of the script_score query. That's why we don't plan to introduce any enhancements to the function_score query, including the one you proposed.

I will keep this issue open though, as it suggests an interesting idea for a new query type we are considering: a complex boolean query that can combine clauses' scores in multiple ways.

This may not be completely relevant to your issue, but ES has a concept of named queries that may help your use case.
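
A minimal Python sketch of the named-queries idea, assuming the two conditions from the request above are rewritten as bool should clauses. The names ids_match and city_match and both helper functions are hypothetical; _name and matched_queries are the parts that come from Elasticsearch.

```python
from typing import Any, Dict, List


def build_named_query() -> Dict[str, Any]:
    """Build a search body whose clauses carry a _name label."""
    return {
        "query": {
            "bool": {
                "should": [
                    # Hypothetical labels chosen for this example.
                    {"terms": {"ids": [1, 2], "_name": "ids_match"}},
                    {"terms": {"location.city_id": ["1"], "_name": "city_match"}},
                ]
            }
        }
    }


def matched_names(hit: Dict[str, Any]) -> List[str]:
    """Return the labels of the named clauses that matched this hit."""
    return list(hit.get("matched_queries", []))
```

This reports which named clauses matched a hit; it does not by itself attach a label to each function's value in _explanation, which is what the issue asks for.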

lrynek commented 5 years ago

@mayya-sharipova Thank you for the response and the tip about named queries - it is very useful 🙂. Looking forward to seeing the new ES compound query. Please take this feature into account - it would be perfect for identifying particular scoring values in the ES response! 🤞

lrynek commented 3 years ago

@mayya-sharipova Hi! 👋 🙂 Any news about this feature request? 🙏 Would you mind adding it to previous ES versions too (6.x as well)?

lrynek commented 3 years ago

@mayya-sharipova Any update on this? It would be awesome to have such a feature, maybe at least in the 7.x version. Or is it better to do it via a plugin? Thanks for any insight 😊

elasticsearchmachine commented 1 year ago

Pinging @elastic/es-search (Team:Search)

lrynek commented 4 months ago

@mayya-sharipova Maybe you can take advantage of the change made in OpenSearch (the same concept as _name, but applied to functions, as we discussed at some point regarding named queries): https://github.com/opensearch-project/OpenSearch/issues/1711

elasticsearchmachine commented 4 months ago

Pinging @elastic/es-search-relevance (Team:Search Relevance)