Elasticsearch Version
8.8 or higher
Installed Plugins
No response
Java Version
bundled
OS Version
Linux
Problem Description
With the introduction of PR https://github.com/apache/lucene/pull/12156 we saw a performance degradation in bool queries where one of the mandatory clauses is a TermInSetQuery whose query terms are not present in the field. Before that change, TermInSetQuery returned null for its ScorerSupplier in such cases, which short-circuited the whole bool query.
This has been fixed in Lucene with https://github.com/apache/lucene/pull/13454, but the fix has not yet been included in Elasticsearch. That PR did not make it into Lucene 9.11.
We need to either pick up that change as part of Lucene 9.12, or patch Elasticsearch to carry the fix until the Lucene change is included in ES.
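As a rough illustration of the short-circuit described above, here is a toy Python model. It is not Lucene code: `make_clause`, `conjunction_supplier` and `build_log` are hypothetical names, and the "non-null supplier" behavior after the regression is an assumption of the model rather than something taken from the Lucene source.

```python
from typing import Callable, List, Optional

# Records which clauses paid the (simulated) scorer-building cost.
build_log: List[str] = []

Supplier = Callable[[], str]


def make_clause(name: str, has_matches: bool, cheap_null: bool) -> Callable[[], Optional[Supplier]]:
    """Return a function playing the role of a clause's scorer-supplier lookup.

    cheap_null=True models the old behavior: a clause with no matching terms
    reports None up front. cheap_null=False models the assumed regressed
    behavior: a supplier is always returned.
    """
    def scorer_supplier() -> Optional[Supplier]:
        if not has_matches and cheap_null:
            return None  # cheap signal: nothing in this segment can match

        def get() -> str:
            build_log.append(name)  # stands in for expensive build_scorer work
            return f"scorer({name})"

        return get

    return scorer_supplier


def conjunction_supplier(clauses) -> Optional[Callable[[], List[str]]]:
    """Toy conjunction: a single None from a mandatory clause skips everything."""
    suppliers = []
    for clause in clauses:
        supplier = clause()
        if supplier is None:
            return None  # short-circuit: the whole bool query cannot match
        suppliers.append(supplier)
    return lambda: [supplier() for supplier in suppliers]


# Old behavior: the terms clause has no matching terms and says so cheaply.
old = conjunction_supplier([
    make_clause("terms", has_matches=False, cheap_null=True),
    make_clause("term", has_matches=True, cheap_null=True),
])
print("old behavior short-circuits:", old is None)
```

In this model the short-circuit costs nothing, whereas with `cheap_null=False` every clause's scorer is built before the query can discover that it matches no documents.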
Steps to Reproduce
Queries with a high number of terms in `terms` queries take a long time in `build_scorer`, as reported by query profiling, specifically when those terms are not present in the field.
Logs (if relevant)
No response
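The reproduction steps above can be sketched as a request body. This is a minimal sketch under stated assumptions: the field names `tags` and `status`, the value `active`, the term count, and the helper `build_repro_request` are all hypothetical and not taken from the report. It builds a bool query with a mandatory `terms` clause whose terms are unlikely to exist in the field, with profiling enabled.

```python
import json


def build_repro_request(field="tags", num_terms=10_000,
                        filter_field="status", filter_value="active"):
    """Build a search body with a mandatory terms clause made of absent terms.

    All field names and values are placeholders; adjust them to the index
    being tested.
    """
    # Generated values that are assumed not to occur in the target field.
    missing_terms = [f"missing-term-{i}" for i in range(num_terms)]
    return {
        "profile": True,  # ask Elasticsearch to return the query profile
        "query": {
            "bool": {
                "filter": [
                    {"term": {filter_field: filter_value}},
                    {"terms": {field: missing_terms}},
                ]
            }
        },
    }


body = build_repro_request()
# Print only the start of the body; the full terms list is long.
print(json.dumps(body)[:200])
```

Sending this body to the `_search` endpoint of an affected index and comparing the `build_scorer` entry in the profile breakdown across versions should show whether the regression applies.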