o19s / elasticsearch-learning-to-rank

Plugin to integrate Learning to Rank (aka machine learning for better relevance) with Elasticsearch
http://opensourceconnections.com/blog/2017/02/14/elasticsearch-learning-to-rank/
Apache License 2.0

Script evaluated differently in LTR? #24

Closed peterdm closed 7 years ago

peterdm commented 7 years ago

I have a sample Painless script that compares two static dates. When I run it inside script_fields, I see the value I expect. However, when I run it as a script query feature in LTR, I get 1.

Baffled!

Test Setup:

POST _scripts/ranklib/dummy
{
  "script": "## LambdaMART\n## No. of trees = 1\n## No. of leaves = 10\n## No. of threshold candidates = 256\n## Learning rate = 0.1\n## Stop early = 100\n\n<ensemble>\n <tree id=\"1\" weight=\"0.1\">\n  <split>\n   <feature> 1 </feature>\n   <threshold> 0.45867884 </threshold>\n   <split pos=\"left\">\n    <feature> 1 </feature>\n    <threshold> 0.0 </threshold>\n    <split pos=\"left\">\n     <output> -2.0 </output>\n    </split>\n    <split pos=\"right\">\n     <output> -1.3413081169128418 </output>\n    </split>\n   </split>\n   <split pos=\"right\">\n    <feature> 1 </feature>\n    <threshold> 0.6115718 </threshold>\n    <split pos=\"left\">\n     <output> 0.3089442849159241 </output>\n    </split>\n    <split pos=\"right\">\n     <output> 2.0 </output>\n    </split>\n   </split>\n  </split>\n </tree>\n</ensemble>"
}

POST /test/empty/
{}

GET /test/empty/_search
{
  "query": {
    "match_all": {}
  }
}

Query to Reproduce Error:

GET /test/empty/_search?explain=true
{
  "query": {
      "match_all": {}
  },
  "script_fields": {
      "days_between": {
          "script": {
              "params": {
                  "search_timestamp": "2017-03-23T00:00:00.000Z",
                  "compare_to": "2017-03-18T04:34:15.606Z"
              },
              "lang": "painless",
              "inline": "return ChronoUnit.DAYS.between(Instant.parse(params.compare_to), Instant.parse(params.search_timestamp))"
          }
      }
  },
  "rescore": {
      "query": {
          "rescore_query": {
              "ltr": {
                  "model": {
                      "stored": "dummy"
                  },
                  "features": [
                      {
                          "script": {
                              "_name": "days_between",
                              "script": {
                                "params": {
                                    "search_timestamp": "2017-03-23T00:00:00.000Z",
                                    "compare_to": "2017-03-18T04:34:15.606Z"
                                },
                                "lang": "painless",
                                "inline": "return ChronoUnit.DAYS.between(Instant.parse(params.compare_to), Instant.parse(params.search_timestamp))"
                            }
                          }
                      }
                  ]
              }
          }
      }
  }
}

For me (5.3.0_0.1.0) the result comes out:

fields.days_between = [ 4 ]
_explanation.details[1].details[0].details[0].details[0].value = 1
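
As a sanity check outside Elasticsearch, the same java.time calls used in the Painless script can be run in plain Java. This is only a sketch: the class and method names (DaysBetweenCheck, daysBetween) are hypothetical and not part of the plugin or of Elasticsearch. It suggests that 4 is the value the script itself computes, so the 1 in the explain output comes from how the script is used as a feature, not from the date arithmetic.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

// Hypothetical helper class, used only to reproduce the script's arithmetic.
final class DaysBetweenCheck {

    // Same expression as the Painless script: whole days from compareTo to searchTimestamp.
    static long daysBetween(String compareTo, String searchTimestamp) {
        return ChronoUnit.DAYS.between(Instant.parse(compareTo), Instant.parse(searchTimestamp));
    }

    public static void main(String[] args) {
        // The two static timestamps passed as params in the request above.
        long days = daysBetween("2017-03-18T04:34:15.606Z", "2017-03-23T00:00:00.000Z");
        System.out.println(days); // prints 4 (4 days and about 19.5 hours, truncated to whole days)
    }
}
```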

softwaredoug commented 7 years ago

@peterdm

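A script query doesn't actually return a score. It acts as a filter: the script's result only decides whether the document matches, so the feature value is always 1 or 0. I think this is similar to the other issue we discussed. A function_score query with a script_score function appears to work: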

GET /test/empty/_search?explain=true
{
   "query": {
      "match_all": {}
   },
   "script_fields": {
      "days_between": {
         "script": {
            "params": {
               "search_timestamp": "2017-03-23T00:00:00.000Z",
               "compare_to": "2017-03-18T04:34:15.606Z"
            },
            "lang": "painless",
            "inline": "return ChronoUnit.DAYS.between(Instant.parse(params.compare_to), Instant.parse(params.search_timestamp))"
         }
      }
   },
   "rescore": {
      "query": {
         "rescore_query": {
            "ltr": {
               "model": {
                  "stored": "dummy"
               },
               "features": [
                  {
                     "function_score": {
                        "_name": "days_between",
                        "functions": [
                           {
                              "script_score": {
                                 "script": {
                                    "params": {
                                       "search_timestamp": "2017-03-23T00:00:00.000Z",
                                       "compare_to": "2017-03-18T04:34:15.606Z"
                                    },
                                    "lang": "painless",
                                    "inline": "return ChronoUnit.DAYS.between(Instant.parse(params.compare_to), Instant.parse(params.search_timestamp))"
                                 }
                              }
                           }
                        ]
                     }
                  }
               ]
            }
         }
      }
   }
}
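
The difference between the two feature definitions can be illustrated with a toy model in plain Java. This is a sketch under stated assumptions, not Elasticsearch or plugin code: the class and method names are hypothetical, the script query is modeled as "non-zero result means match, a match scores a constant 1", and the function_score case assumes the default match_all inner query (score 1) multiplied by the script_score value.

```java
// Hypothetical toy model of how each query type turns a script result into a feature value.
final class FeatureValueSketch {

    // Assumed model of a `script` query: the result is used only as a match/no-match filter.
    static float scriptQueryFeature(long scriptResult) {
        return scriptResult != 0 ? 1.0f : 0.0f;
    }

    // Assumed model of `function_score` + `script_score`: the script result is the score
    // (inner query score of 1 multiplied by the function value).
    static float scriptScoreFeature(long scriptResult) {
        return 1.0f * scriptResult;
    }

    public static void main(String[] args) {
        long daysBetween = 4L; // value the Painless script returns for the dates in this issue
        System.out.println(scriptQueryFeature(daysBetween)); // prints 1.0
        System.out.println(scriptScoreFeature(daysBetween)); // prints 4.0
    }
}
```

Under this model, the LTR feature only sees the script's numeric value when the script is wrapped in script_score.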

softwaredoug commented 7 years ago

@peterdm safe to close?

peterdm commented 7 years ago

Yup. Thanks.