o19s / elasticsearch-learning-to-rank

Plugin to integrate Learning to Rank (aka machine learning for better relevance) with Elasticsearch
http://opensourceconnections.com/blog/2017/02/14/elasticsearch-learning-to-rank/
Apache License 2.0

Is the result score the _score of LTR plus the _score of ES? #201

Closed (zhuqingrui closed this issue 5 years ago)

nomoa commented 5 years ago

The result score is determined by Elasticsearch and the rescore settings you configure.
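
As a rough illustration of what those settings do, the sketch below mimics how a query rescorer combines the original query score with the rescore query score for a document inside the rescore window. The function name `combine_scores` is hypothetical; the `score_mode` values and the default weights of 1 follow the Elasticsearch rescore documentation as I understand it, so treat this as an approximation and not as the engine's actual code.

```python
def combine_scores(query_score, rescore_score, query_weight=1.0,
                   rescore_query_weight=1.0, score_mode="total"):
    """Hypothetical helper: approximate final score of a doc in the rescore window."""
    primary = query_weight * query_score              # weighted main query score
    secondary = rescore_query_weight * rescore_score  # weighted rescore (e.g. sltr) score
    if score_mode == "total":
        return primary + secondary
    if score_mode == "multiply":
        return primary * secondary
    if score_mode == "avg":
        return (primary + secondary) / 2
    if score_mode == "max":
        return max(primary, secondary)
    if score_mode == "min":
        return min(primary, secondary)
    raise ValueError(f"unknown score_mode: {score_mode}")


# Made-up scores: main query score 12.5, LTR score 3.0, default settings.
print(combine_scores(12.5, 3.0))
```

With no weights or `score_mode` set, an in-window document therefore ends up with the sum of the two scores.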

zhuqingrui commented 5 years ago

Thanks. This is my sltr query. How should I set up the sltr query if I only want to get the _score of LTR?

{
  "_source": {
    "includes": ["CBID", "authorname", "staticscore", "title", "tag"]
  },
  "query": {
    "function_score": {
      "boost_mode": "sum",
      "field_value_factor": {
        "factor": 0.4,
        "field": "staticscore",
        "missing": 0
      },
      "query": {
        "bool": {
          "filter": [
            {"terms": {"checklevel": [15, 10, 9]}},
            {"term": {"auditstatus": 19}}
          ],
          "must": {
            "bool": {
              "should": [
                {"match": {"authorname": {"boost": 5.0, "query": "苏还"}}},
                {"match": {"intro": {"boost": 1, "query": "苏还"}}},
                {"match": {"tag": {"boost": 3, "query": "苏还"}}},
                {"match": {"title": {"boost": 2.0, "query": "苏还"}}}
              ]
            }
          }
        }
      }
    }
  },
  "rescore": {
    "query": {
      "rescore_query": {
        "sltr": {
          "model": "hx_LambdaMART_001",
          "params": {
            "keywords": "苏还"
          }
        }
      }
    },
    "window_size": 1000
  },
  "from": 0,
  "size": 100
}

nomoa commented 5 years ago

Getting only the score of the LTR query is not trivial with Elasticsearch's rescore mechanism, for various reasons. Here is the trick we use to keep the LTR ranking while still getting consistent ranking when paginating past the rescore window:

{
    "window_size": 1024,
    "query": {
        "query_weight": 1,
        "rescore_query_weight": 10000,
        "score_mode": "total",
        "rescore_query": {
            "bool": {
                "should": [
                    {
                        "constant_score": {
                            "filter": {
                                "match_all": {}
                            },
                            "boost": 100000
                        }
                    },
                    {
                        "sltr": {
                            "model": "20180316-mrmr_enwiki_v1",
                            "params": {
                                "query_string": "test"
                            }
                        }
                    }
                ]
            }
        }
    }
}

The reason is that Elasticsearch does not provide a "replace" mode for scores, so you have to use a hack to simulate one. In the example I shared, we over-boost the LTR score so that we are sure the minimum delta between two LTR scores is greater than the delta between two main query scores. Note that setting query_weight to zero does not work well, because as soon as you paginate past the rescore window size you get random ranking.
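
To make the arithmetic behind this hack concrete, here is a small sketch with made-up scores. It assumes `score_mode: total`, that the `bool`/`should` rescore query sums the constant 100000 and the LTR score, and that documents outside the window keep `query_weight` times their original score. The function name, document ids, and all numbers are hypothetical, and it uses plain Python floats instead of the engine's own score representation.

```python
QUERY_WEIGHT = 1
RESCORE_QUERY_WEIGHT = 10000
CONSTANT_BOOST = 100000  # boost of the constant_score clause


def final_score(query_score, ltr_score=None):
    """Hypothetical helper; ltr_score=None means the doc is outside the rescore window."""
    if ltr_score is None:
        return QUERY_WEIGHT * query_score
    # Assumed: the bool/should rescore query scores as constant boost + LTR score.
    rescore = CONSTANT_BOOST + ltr_score
    return QUERY_WEIGHT * query_score + RESCORE_QUERY_WEIGHT * rescore


# Made-up data: (doc id, main query score, LTR score)
in_window = [("a", 42.0, 0.10), ("b", 7.0, 0.35), ("c", 19.0, -0.20)]
# Made-up data: (doc id, main query score) for hits beyond the window
outside_window = [("d", 6.5), ("e", 3.2)]

scored = [(doc, final_score(q, ltr)) for doc, q, ltr in in_window]
scored += [(doc, final_score(q)) for doc, q in outside_window]
ranking = [doc for doc, _ in sorted(scored, key=lambda pair: pair[1], reverse=True)]
print(ranking)  # ['b', 'a', 'c', 'd', 'e']
```

Inside the window the order follows the LTR scores (b, a, c) even though the main query alone would rank them a, c, b. The out-of-window documents stay below and keep their main query order. Setting `QUERY_WEIGHT` to 0 in this sketch gives every out-of-window document a score of 0, which is the tie behind the random ranking mentioned above.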

zhuqingrui commented 5 years ago

Thank you again. I tried it and it worked.