mozilla / elasticutils

[deprecated] A friendly chainable ElasticSearch interface for python
http://elasticutils.rtfd.org
BSD 3-Clause "New" or "Revised" License

support function scoring #233

Open robhudson opened 10 years ago

robhudson commented 10 years ago

Boosting via the special _boost field has been removed in Elasticsearch >= 1.0, and the suggested alternative is function scoring. See http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/_deprecations.html

I propose we support function scoring in the following way. Feedback welcome.

Add a method to S called function_score. This method will take a query in the form of a Q-type object, along with other arguments that define the function scoring, such as boost, score_mode, boost_mode and function(s).

For example:

qs = S()
qs = qs.function_score(
    query=Q(title__match='shoes'),
    function={'script_score': {'script': "_score * doc['weight'].value"}},
)
qs = qs.filter(...)
...

Any calls to .query(...) on this would set the query outside of the function score. The function arg could also take a list for the case of passing a list of functions.
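
For reference, here is a rough sketch of the Elasticsearch query such a call might produce. The helper name build_function_score is hypothetical and not part of ElasticUtils. The function_score, functions, score_mode and boost_mode keys come from the Elasticsearch query DSL. The sketch assumes the Q object has already been rendered to a plain query dict.

```python
def build_function_score(query, function, score_mode=None, boost_mode=None):
    """Hypothetical helper: wrap a query dict in a function_score query."""
    # Accept a single function dict or a list of them, as proposed above.
    functions = function if isinstance(function, list) else [function]
    body = {'query': query, 'functions': functions}
    # Only include the optional settings when the caller sets them,
    # so Elasticsearch applies its own defaults otherwise.
    if score_mode is not None:
        body['score_mode'] = score_mode
    if boost_mode is not None:
        body['boost_mode'] = boost_mode
    return {'function_score': body}


# Assumed rendering of Q(title__match='shoes') as a match query.
fs_query = build_function_score(
    query={'match': {'title': 'shoes'}},
    function={'script_score': {'script': "_score * doc['weight'].value"}},
    boost_mode='replace',
)
```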

Feel free to bikeshed on the naming of these. I can work on the implementation and adjust naming after the fact to make a nicer API. There are so many variables in function scoring that it seems hard to nail down a nice API.

willkg commented 10 years ago

(I was talking with Rob about this earlier.)

Another idea was to create a FSQ class that took these arguments and would get passed into .query().

qs = S()
fsq = FSQ(
    query=Q(title__match='shoes'),
    function={'script_score': {'script': "_score * doc['weight'].value"}},
)
qs = qs.query(fsq)

Part of me thinks it's nice to have all queries go in .query() or .query_raw() and not add additional query-section methods.

willkg commented 10 years ago

There are two requirements here:

  1. provide a replacement for "boost" for Elasticsearch 1.0 users
  2. provide support for function score queries

The first one is important right now. The second one I vote we push off to a new version of ElasticUtils after we've ditched support for Elasticsearch 0.90 and prior.

Given that, I'm re-focusing this bug to just cover the first item.

willkg commented 10 years ago

I screwed up and misread things. The existing .boost() will work fine with ES 1.0. What doesn't work is having a field in your document that defines the boost for that document. But ElasticUtils doesn't really do anything index-related.

Given that, I'm re-scoping this to "support function_score for Elasticsearch 1.0+" and pushing it out of the 0.10 milestone because we don't need it for bridging people from Elasticsearch 0.90 to 1.0.

pcompassion commented 9 years ago

+1

Is there a way to do function_score using some kind of raw mechanism until it gets implemented in elasticutils?

e.g., is it possible to construct a query in elasticutils, modify it, and pass it to pyelasticsearch?

patrick91 commented 9 years ago

That's doable. I did something like this a couple of weeks ago. You can use the build_search method of the S class:

some_s = S()
print(some_s.build_search())

willkg commented 9 years ago

Two things:

  1. elasticutils doesn't use pyelasticsearch anymore--it uses elasticsearch-py (crazy library names)
  2. I haven't had time to work on elasticutils in a while and I don't see that changing any time soon, but if anyone wants to submit a pull request, I'm definitely interested

pcompassion commented 9 years ago

@patrick91 That's good news! Can you give me a little more direction?

Did you subclass S to override build_search (to support function scoring)?
Or did you modify what you got from build_search and send it to elasticsearch-py?

patrick91 commented 9 years ago

@pcompassion no, I did something like this:

some_s = S()

# a couple of queries here

# then we get the raw search dict
raw = some_s.build_search()

# do a couple of changes here
# like raw['XXX'] = { ... }

# and finally query again using the changed stuff

res = some_s.query_raw(raw)

pcompassion commented 9 years ago

@patrick91: thanks for the prompt response!
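
For anyone trying the workaround above, the "do a couple of changes" step might look roughly like the sketch below. It works on plain dicts only. wrap_in_function_score is a hypothetical helper, not part of elasticutils, and the sketch assumes the dict from build_search() keeps its query under a top-level 'query' key.

```python
import copy


def wrap_in_function_score(search_body, functions, boost_mode=None):
    """Return a copy of search_body whose query is wrapped in function_score."""
    new_body = copy.deepcopy(search_body)  # leave the caller's dict untouched
    # Fall back to match_all when the search had no query section.
    inner = new_body.get('query', {'match_all': {}})
    function_score = {'query': inner, 'functions': list(functions)}
    if boost_mode is not None:
        function_score['boost_mode'] = boost_mode
    new_body['query'] = {'function_score': function_score}
    return new_body


# Stand-in for the dict returned by some_s.build_search() (assumed shape).
raw = {'query': {'match': {'title': 'shoes'}}, 'size': 10}
wrapped = wrap_in_function_score(
    raw,
    functions=[{'script_score': {'script': "_score * doc['weight'].value"}}],
)
```

The resulting dict could then be sent to Elasticsearch directly, for example as the body argument of the elasticsearch-py search() call, if passing it back through elasticutils does not fit your case.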