Open joshdevins opened 3 years ago
Pinging @elastic/es-search (Team:Search)
> assuming that the global reranking could be done in the application layer.
IMO this is usually an antipattern, given the tons of results you need to fetch and the lack of the index statistics and queries you'd want to run. Though I can understand its simplicity for people who just want their numpy/pandas/model or whatever to run in an API as expected, without translating to Elasticsearch queries.
> Make use of machine learning in Elasticsearch for reranking. With the availability of native machine learning inference in Elasticsearch, rerankers could rely on inference to predict a score.
This would be great. Certainly happy to provide any advice here given our experience with the LTR plugin, though @joshdevins I know you have experience with the LTR plugin too.
I'm not sure about the overall system you're proposing; you mention a loss function in Elasticsearch, though in my experience the training tends to happen offline. What would be ideal to me is for Elasticsearch to speak some standard model serialization format, so I could get as close to 'deploy my Python code/model to Elasticsearch' as possible. It doesn't need to be Python code per se, but if I trained something in sklearn, I'd want it to be easily hosted in Elasticsearch with minimal fuss.
Related topics:
I wish we could explicitly have search phases: take some action conditional on the number of results (like running a more relaxed or expanded query on 0 results). This is a common pattern of loosening precision based on the query results, and it's a bit cumbersome to do from a search service, as in the sketch below.
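A minimal sketch of what that workaround looks like from an application-side search service today (index and field names are hypothetical):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

def search_with_fallback(text: str):
    # Phase 1: strict matching, all terms required.
    strict = es.search(
        index="products",
        query={"match": {"title": {"query": text, "operator": "and"}}},
    )
    if strict["hits"]["total"]["value"] > 0:
        return strict

    # Phase 2: only reached on zero hits — relax to any-term matching
    # with fuzziness. Every fallback costs another round trip from the
    # search service to the cluster.
    return es.search(
        index="products",
        query={"match": {"title": {"query": text, "operator": "or", "fuzziness": "AUTO"}}},
    )
```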
I wish we could explicitly provide a way to look over the full result set and diversify it based on some criteria. Though I suppose there are grouping aggregations that help with this.
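For reference, field collapsing is probably the closest built-in today; a minimal sketch that keeps one top hit per brand value, with a couple of alternates retained via inner_hits (index and field names are hypothetical):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Diversify by collapsing on a keyword field: at most one top-level
# hit per distinct "brand" value.
resp = es.search(
    index="products",
    query={"match": {"title": "running shoes"}},
    collapse={
        "field": "brand",
        "inner_hits": {"name": "per_brand", "size": 2},
    },
)
for hit in resp["hits"]["hits"]:
    print(hit["fields"]["brand"][0], "->", hit["_source"]["title"])
```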
> I'm not sure about the overall system you're proposing; you mention a loss function in Elasticsearch, though in my experience the training tends to happen offline. What would be ideal to me is for Elasticsearch to speak some standard model serialization format, so I could get as close to 'deploy my Python code/model to Elasticsearch' as possible.
I'm referring to Elasticsearch's ML capabilities (Data Frame Analytics), which can train models based on data in an index, such as for regression and classification. The ranking loss is just a mention of what we could add to what we can do today. We can, however, already import XGBoost and LightGBM models, for example through eland. So if you've trained an LGBM model with a ranking loss, we can do inference in Elasticsearch already (at ingest time only for now, not yet available to a reranker, but it could be), assuming you have the same features available to the model.
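A minimal sketch of that flow, assuming an eland version whose MLModel.import_model supports LGBMRanker (model id, feature names, and training data here are all hypothetical):

```python
import numpy as np
from lightgbm import LGBMRanker
from elasticsearch import Elasticsearch
from eland.ml import MLModel

feature_names = ["bm25_title", "bm25_body", "popularity"]

# Toy training set: 100 documents spread over 10 queries of 10 docs each.
X = np.random.rand(100, len(feature_names))
y = np.random.randint(0, 4, size=100)  # graded relevance labels
groups = [10] * 10                     # docs-per-query, required by the ranking loss

# Train offline with a ranking objective.
ranker = LGBMRanker(objective="lambdarank", n_estimators=50)
ranker.fit(X, y, group=groups)

# Serialize the trained model into Elasticsearch so inference can run
# server-side (e.g. in an inference ingest processor).
es = Elasticsearch("http://localhost:9200")
MLModel.import_model(
    es,
    model_id="my-ltr-model",
    model=ranker,
    feature_names=feature_names,
    es_if_exists="replace",
)
```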
I agree with @softwaredoug on the "antipattern" part, not only because it is an API leak (reranking is better encapsulated where it can leverage the search engine's parallelism), but also because it adds another stage where interpreting scores can be difficult without access to the document metadata and index statistics.
In practice I've rarely seen efforts to go beyond what the search engine offers, which limits progress on improving relevance. When I worked with LTR, I'd be "blocked" by the unavailability of more complex, composable functions for the features: for instance, multi-term frequencies or a way to smooth them. Making it easier to add such functions could turn the whole process of building a model into a real experiment.
And to make it practically useful, adding debug info, like time spent per stage, would help bring the model into the spotlight and prove its effectiveness, as well as give us the tools to compare models.
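For what it's worth, the search profiler already exposes per-query and per-collector timings, which is a starting point for this kind of per-stage debug info; a minimal sketch (index and field names are hypothetical):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# profile=True returns a per-shard breakdown of time spent in each
# query and collector phase alongside the normal hits.
resp = es.search(
    index="products",
    query={"match": {"title": "running shoes"}},
    profile=True,
)
for shard in resp["profile"]["shards"]:
    for search in shard["searches"]:
        for q in search["query"]:
            print(q["type"], q["time_in_nanos"])
```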
Pinging @elastic/es-search-relevance (Team:Search Relevance)
A common, modern approach to ranking is to perform multiple stages of reranking, with progressively more expensive but more effective rerankers [1][2][3]. For example, a BM25 first stage ranks all documents, as is typical today; a second-stage LTR (learning to rank) reranker could then rerank the top-1000, and possibly a third, neural reranker would rerank the top-100. This would be a simple cascade of rerankers, whereas today's functionality allows for only a single reranking, and only on the top-k per shard.
This should also allow a configurable "location" of reranking: you should be able to choose whether to rerank the top-k per shard, or to collect the per-shard top-k and rerank a global top-k [4]. This request was recently discussed in https://github.com/elastic/elasticsearch/issues/60946 but closed due to a lack of need, assuming that the global reranking could be done in the application layer. While this is true, there are a couple of possible benefits to reranking in the coordinating node that come to mind (perhaps there are more, and there are certainly some contrary arguments).
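For context, the closest primitive today is the rescorer, which can already be chained per shard; a minimal sketch of a two-stage cascade on top of BM25, with script_score queries standing in for real LTR/neural rerankers (index and field names are hypothetical):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

resp = es.search(
    index="products",
    query={"match": {"title": "running shoes"}},  # stage 1: BM25 over all docs
    rescore=[
        {   # stage 2: cheaper reranker over the top 1000 — per shard, not globally
            "window_size": 1000,
            "query": {
                "rescore_query": {
                    "script_score": {
                        "query": {"match_all": {}},
                        "script": {"source": "Math.log(2 + doc['popularity'].value)"},
                    }
                }
            },
        },
        {   # stage 3: more expensive reranker over the top 100 of stage 2's order
            "window_size": 100,
            "query": {
                "rescore_query": {
                    "script_score": {
                        "query": {"match_all": {}},
                        "script": {"source": "doc['quality'].value"},
                    }
                }
            },
        },
    ],
)
```

Note that each window applies per shard, which is exactly the limitation described above: there is no way to express "rerank the global top-k on the coordinating node" with this mechanism.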
[1] Learning to Efficiently Rank with Cascades
[2] Phased Ranking in Vespa.ai
[3] Multi-Stage Document Ranking with BERT
[4] Ranking Relevance in Yahoo Search, Section 3.2, Contextual Reranking