Multi-term queries that accept all strings should rewrite to an exists query

jimczi commented 4 years ago

Multi-term queries that accept all strings can be very costly, since they need to visit the postings of every term in the terms dictionary. The query_string query tries to detect this early, at parse time, and rewrites any field:* into an exists query on the field. We should apply the same logic to multi-term queries to reduce the cost of these simple queries. QL (SQL, EQL) can easily produce this kind of query (field == *) in the translation layer, so it's important that we optimize this case consistently. We could also improve the detection so that it does not rely on string matching, by working directly on the automaton produced by each multi-term query: if the minimized automaton accepts all strings (is total), we can rewrite to a simple exists query. That would let us detect variations like ** or .*?.*
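To make the automaton idea concrete, here is a minimal sketch in Python rather than in the actual Lucene/Elasticsearch code. It assumes a simplified wildcard syntax (only `*` for any sequence and `?` for any single character, with no escaping) and does not cover regular expressions such as .*?.*. The names `accepts_all_strings` and `rewrite_wildcard` are hypothetical helpers made up for illustration. The pattern is treated as an NFA whose states are positions in the pattern; the check determinizes it on the fly and reports "total" only if every reachable state is accepting.

```python
# Stands for "any character that does not appear literally in the pattern".
OTHER = object()


def accepts_all_strings(pattern: str) -> bool:
    """Return True if the simplified wildcard pattern matches every string."""
    n = len(pattern)  # position n is the single accepting state
    alphabet = {c for c in pattern if c not in "*?"} | {OTHER}

    def closure(positions):
        # A '*' may match the empty sequence, so it can be skipped for free.
        stack, seen = list(positions), set(positions)
        while stack:
            i = stack.pop()
            if i < n and pattern[i] == "*" and i + 1 not in seen:
                seen.add(i + 1)
                stack.append(i + 1)
        return frozenset(seen)

    def step(state, symbol):
        nxt = set()
        for i in state:
            if i == n:
                continue  # nothing can be consumed after the end of the pattern
            p = pattern[i]
            if p == "*":
                nxt.add(i)  # '*' consumes the character and stays in place
            elif p == "?" or p == symbol:
                nxt.add(i + 1)
        return closure(nxt)

    # Subset construction: explore every reachable set of NFA positions.
    start = closure({0})
    seen, todo = {start}, [start]
    while todo:
        state = todo.pop()
        if n not in state:
            return False  # a reachable non-accepting state rejects some string
        for symbol in alphabet:
            nxt = step(state, symbol)
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return True


def rewrite_wildcard(field: str, pattern: str) -> dict:
    """Hypothetical rewrite step: use an exists query when the pattern is total."""
    if accepts_all_strings(pattern):
        return {"exists": {"field": field}}
    return {"wildcard": {field: {"value": pattern}}}


print(rewrite_wildcard("user", "**"))
print(rewrite_wildcard("user", "ki*y"))
```

Because the check works on the automaton rather than on the pattern text, `*` and `**` are both recognized as total without any special casing, while patterns such as `*?` (which rejects the empty string) are left as regular wildcard queries.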

elasticmachine commented 4 years ago

Pinging @elastic/es-search (:Search/Search)

jimczi commented 4 years ago

cc @elastic/es-ql

elasticsearchmachine commented 3 months ago

Pinging @elastic/es-search-relevance (Team:Search Relevance)

elastic / elasticsearch #62760