apache / lucene

Apache Lucene open-source search software
https://lucene.apache.org/
Apache License 2.0

Allow MultiLeafKnnCollector.greediness to be configurable #13699

Open dungba88 opened 2 weeks ago

dungba88 commented 2 weeks ago

Description

In Lucene 9.10, we introduced a new algorithm for sharing state between segments to speed up vector search. This article explains the feature quite well. As I understand it, the greediness parameter is meant to control the latency-recall tradeoff: a low value means a more relaxed search (higher recall but higher latency, with the global queue mattering less), and a high value means a more restrictive search (lower latency but lower recall, with the global queue mattering more).

However, the current code doesn't allow the parameter to be tuned: it always uses the default value of 0.9f. I think making it configurable would let each application use a value suited to its specific use case and document/query patterns.
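For context, here is a toy model of how I understand the hard-coded value to be used: the per-segment queue of non-competitive similarities is sized in proportion to `(1 - greediness)`, so a lower greediness keeps more local candidates before deferring to the global queue. The class and method names below are hypothetical, the sizing rule and the `[0, 1]` range are my assumptions, and this is a sketch of the arithmetic rather than the actual `MultiLeafKnnCollector` code.

```java
/** Toy model of the greediness arithmetic; NOT the real MultiLeafKnnCollector. */
final class GreedinessSketch {
    /** The default value mentioned in this issue. */
    static final float DEFAULT_GREEDINESS = 0.9f;

    private GreedinessSketch() {}

    /**
     * Assumed sizing rule: the local (per-segment) queue holds roughly
     * (1 - greediness) * k entries, and never fewer than one.
     */
    static int localQueueSize(int k, float greediness) {
        if (k < 1) {
            throw new IllegalArgumentException("k must be >= 1, got " + k);
        }
        // The [0, 1] range is an assumption made for this sketch.
        if (!(greediness >= 0f && greediness <= 1f)) {
            throw new IllegalArgumentException("greediness must be in [0, 1], got " + greediness);
        }
        return Math.max(1, Math.round((1 - greediness) * k));
    }
}
```

Under this model, k = 100 with the default 0.9f gives a local queue of 10 entries, while a greediness of 0.5f would give 50.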

As for where users could configure this parameter, I'm thinking of 2 places:

We can make a PR, but would like to hear from the community first.

jpountz commented 2 weeks ago

Lucene has tens of parameters like this one, and exposing them all would make our APIs look rather bad. How do you envision users tuning this parameter? Would it be good enough if we made it configurable through something like a system property rather than through the API?
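To make the system-property idea concrete, a minimal sketch could look like the following. The property name `lucene.knn.greediness` and the helper class are hypothetical (no such property exists in Lucene today), and the `[0, 1]` validation range is an assumption; only `System.getProperty` and `Float.parseFloat` are real JDK calls.

```java
/** Sketch of reading greediness from a JVM system property; names are hypothetical. */
final class GreedinessFromSystemProperty {
    /** Hypothetical property name, used only for illustration. */
    static final String PROPERTY = "lucene.knn.greediness";

    /** The default value mentioned in this issue. */
    static final float DEFAULT_GREEDINESS = 0.9f;

    private GreedinessFromSystemProperty() {}

    /** Returns the configured value, or the default when the property is unset. */
    static float resolve() {
        String raw = System.getProperty(PROPERTY);
        if (raw == null) {
            return DEFAULT_GREEDINESS;
        }
        float value;
        try {
            value = Float.parseFloat(raw.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException(PROPERTY + " is not a number: " + raw, e);
        }
        // The [0, 1] range is an assumption; this check also rejects NaN.
        if (!(value >= 0f && value <= 1f)) {
            throw new IllegalArgumentException(PROPERTY + " must be in [0, 1], got " + raw);
        }
        return value;
    }
}
```

It would be set at startup with something like `-Dlucene.knn.greediness=0.8`, which means one value applies to every query in the JVM.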

msokolov commented 2 weeks ago

Do we have other internal API parameters via system property? I'm wondering why you think that's preferable to adding Java functions?

dungba88 commented 2 weeks ago

I think a system property might be too restrictive for configurability. This parameter is a query-time control, but a system property inherently enforces a single value for the whole application (and it has to be determined at startup). Some use cases that I can see during tuning are:

That being said, I agree that having too many parameters would also be bad for API brevity. What if we allowed configuring this parameter through a separate method after the AbstractKnnVectorQuery is created? And if there are multiple parameters, as in the case of AbstractVectorSimilarityQuery, we could group them in a single struct-like object (such as TunableParameters; of course we should try to keep them as few as possible).
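Roughly, the shape I have in mind is sketched below. Everything here is hypothetical: `TunableParameters` is just the placeholder name from above, `SketchKnnQuery` is a stand-in that does not extend or mirror `AbstractKnnVectorQuery`, and the `[0, 1]` range check is an assumption.

```java
import java.util.Objects;

/** Hypothetical struct-like holder for low-level tuning knobs; not a Lucene class. */
final class TunableParameters {
    /** Uses the current hard-coded default of 0.9f. */
    static final TunableParameters DEFAULT = new TunableParameters(0.9f);

    private final float greediness;

    TunableParameters(float greediness) {
        // The [0, 1] range is an assumption; this check also rejects NaN.
        if (!(greediness >= 0f && greediness <= 1f)) {
            throw new IllegalArgumentException("greediness must be in [0, 1], got " + greediness);
        }
        this.greediness = greediness;
    }

    float greediness() {
        return greediness;
    }
}

/** Stand-in for a KNN query, showing only the configuration surface. */
final class SketchKnnQuery {
    private final int k;
    private TunableParameters params = TunableParameters.DEFAULT;

    SketchKnnQuery(int k) {
        if (k < 1) {
            throw new IllegalArgumentException("k must be >= 1, got " + k);
        }
        this.k = k;
    }

    /** Separate method, so the constructor signature stays unchanged. */
    SketchKnnQuery setTunableParameters(TunableParameters params) {
        this.params = Objects.requireNonNull(params, "params");
        return this;
    }

    int k() {
        return k;
    }

    float greediness() {
        return params.greediness();
    }
}
```

A caller who never touches the setter keeps the default, while a tail query could opt into a more relaxed search with `new SketchKnnQuery(100).setTunableParameters(new TunableParameters(0.5f))`.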

jpountz commented 1 week ago

Do we have other internal API parameters via system property? I'm wondering why you think that's preferable to adding Java functions?

I don't think we do indeed. I'm trying to be creative to avoid the API tax of exposing all these low-level tuning knobs. To my knowledge, most of these knobs are not exposed, e.g. we don't allow users to tell us whether disjunctions should run using MAXSCORE or WAND, we don't allow users to configure the threshold in IndexOrDocValuesQuery, we don't allow users to configure whether filtered vector search queries should use the HNSW graph or perform a brute force search, etc.

Different set of queries may need different value of greediness. For example, for tail queries, which have small lexical match-set, we may want to get more recall here

FWIW I'd be less concerned about exposing a higher-level configuration option on the query, such as a "desired recall" or something along these lines and then make decisions for greediness (and other things) based on this option.