Multi-term queries that accept all strings can be very costly, since they need to visit the postings of the entire terms dictionary. The `query_string` query tries to detect this early, at parse time, and rewrites any `field:*` into an `exists` query on the field. We should apply the same logic to multi-term queries in order to reduce the cost of these simple queries.
QL (SQL, EQL) can easily produce this kind of query (`field == *`) in the translation layer, so it's important that we optimize this case consistently.
We could also improve the detection by working directly on the automaton produced by each multi-term query instead of relying on string matching. If the minimized automaton accepts all strings (is total), we can rewrite the query to a simple `exists` query.
That would allow us to detect variations like `**` or `.*?.*`.
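
To make the automaton idea concrete, here is a minimal, self-contained sketch in Python. It is not Lucene or Elasticsearch code. It assumes a simplified wildcard syntax (`*` matches any run of characters, `?` matches exactly one character, no escape handling), and the names `accepts_all_strings` and `rewrite_wildcard` are hypothetical. The pattern is treated as an NFA, determinized on the fly, and considered total only if every reachable state set is accepting. A regexp such as `.*?.*` would need a regexp-to-automaton step that this sketch does not cover.

```python
from collections import deque

OTHER = None  # stands for every character that is not a literal in the pattern


def _closure(states, pattern):
    """Add states reachable without input: '*' may match zero characters."""
    result = set(states)
    stack = list(states)
    while stack:
        i = stack.pop()
        if i < len(pattern) and pattern[i] == "*" and i + 1 not in result:
            result.add(i + 1)
            stack.append(i + 1)
    return frozenset(result)


def _step(states, symbol, pattern):
    """Consume one character (a pattern literal, or OTHER) from every state."""
    nxt = set()
    for i in states:
        if i == len(pattern):
            continue  # the end of the pattern cannot consume more input
        tok = pattern[i]
        if tok == "*":
            nxt.add(i)  # '*' absorbs the character and stays in place
        elif tok == "?" or tok == symbol:
            nxt.add(i + 1)  # '?' or a matching literal advances
    return _closure(nxt, pattern)


def accepts_all_strings(pattern):
    """Return True if the wildcard pattern matches every possible string.

    Runs a subset construction over the pattern's literals plus OTHER and
    checks that every reachable set of NFA states contains the accept state.
    """
    alphabet = {c for c in pattern if c not in "*?"} | {OTHER}
    accept = len(pattern)
    start = _closure({0}, pattern)
    seen = {start}
    queue = deque([start])
    while queue:
        states = queue.popleft()
        if accept not in states:
            return False  # some string leads to a non-accepting state
        for symbol in alphabet:
            nxt = _step(states, symbol, pattern)
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True


def rewrite_wildcard(field, pattern):
    """Hypothetical rewrite hook: use an exists query for match-all patterns."""
    if accepts_all_strings(pattern):
        return {"exists": {"field": field}}
    return {"wildcard": {field: {"value": pattern}}}
```

With this check, `*` and `**` are both rewritten to an `exists` query, while `*?` is left alone because it does not match the empty string. A plain string comparison against `*` would miss the `**` case.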