elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch
Other
984 stars 24.82k forks source link

ESQL - qstr function should only work with available fields #112854

Open ioanatia opened 1 month ago

ioanatia commented 1 month ago

The QSTR function uses the query_string Elasticsearch DSL query. Because we do not verify the contents of the query argument, the qstr function breaks some of the assumptions in ES|QL. For this reason we have put a few restrictions in place so that qstr is not available after a KEEP or DROP command, to guard against cases like these:

FROM my-index
| DROP title
| WHERE qstr("title:abc")

or

FROM my-index
| KEEP content
| WHERE qstr("title:abc")

Even though we disallow the use of qstr after a KEEP or DROP command, we still have no way to guard against the case where the qstr function queries fields that are unsupported in ES|QL. The simplest case of an unsupported field is a field conflict, e.g. title is a keyword in index1 and a text field in index2; when querying index1,index2 in ES|QL, title is unsupported unless union types are used. However, an ES|QL query like FROM index1,index2 | WHERE qstr("title:abc") will actually query the title field.
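To make the field-conflict case concrete, here is a minimal Java sketch of how a field such as title ends up unsupported when its type differs between indices. This is a toy model, not the ES|QL resolution code: `FieldConflictSketch`, `typesByField` and `supportedFields` are hypothetical names, and mappings are simplified to plain field-name-to-type maps.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: none of these names are Elasticsearch classes.
class FieldConflictSketch {

    // Collects, for every field name, the distinct types it has across the given index mappings.
    static Map<String, Set<String>> typesByField(List<Map<String, String>> indexMappings) {
        Map<String, Set<String>> types = new LinkedHashMap<>();
        for (Map<String, String> mapping : indexMappings) {
            for (Map.Entry<String, String> entry : mapping.entrySet()) {
                types.computeIfAbsent(entry.getKey(), k -> new LinkedHashSet<>()).add(entry.getValue());
            }
        }
        return types;
    }

    // In this sketch a field is "supported" only if it has exactly one type across all indices.
    static Set<String> supportedFields(List<Map<String, String>> indexMappings) {
        Set<String> supported = new LinkedHashSet<>();
        typesByField(indexMappings).forEach((field, fieldTypes) -> {
            if (fieldTypes.size() == 1) {
                supported.add(field);
            }
        });
        return supported;
    }

    public static void main(String[] args) {
        // Assumed example mappings: title conflicts (keyword vs text), content does not.
        Map<String, String> index1 = Map.of("title", "keyword", "content", "text");
        Map<String, String> index2 = Map.of("title", "text", "content", "text");
        System.out.println(supportedFields(List.of(index1, index2))); // prints [content]
    }
}
```

In this model, qstr("title:abc") over index1,index2 should not be allowed to touch title, because title is not in the supported set.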

A solution that would address this last problem, and would also lift the restrictions on using qstr after DROP/KEEP, is to subclass the query_string query in the context of ES|QL and add a separate option to it for available fields. We can then override the mapping lookup methods so that they return only the fields that are available. This way we control which fields the qstr function queries, without needing to parse the query input ourselves or reimplement a parser for it. The available fields will most likely be provisioned to the qstr function at the Analyzer level (TBD).
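The idea of restricting field resolution rather than parsing the query could be sketched roughly as follows. This is an illustration under stated assumptions, not the actual Elasticsearch code: `MappingLookup`, `RestrictedMappingLookup` and `fieldType` are hypothetical stand-ins for whatever lookup the query_string parser consults, and the available-fields set is assumed to come from the Analyzer.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Toy model only: none of these names are Elasticsearch classes.
class AvailableFieldsSketch {

    // Stand-in for the mapping lookup a query parser uses to resolve field names.
    static class MappingLookup {
        private final Map<String, String> fieldTypes;

        MappingLookup(Map<String, String> fieldTypes) {
            this.fieldTypes = new LinkedHashMap<>(fieldTypes);
        }

        // Returns the type of the named field, or empty if the field is unknown.
        Optional<String> fieldType(String name) {
            return Optional.ofNullable(fieldTypes.get(name));
        }
    }

    // Subclass that hides every field that is not in the available-fields set.
    static class RestrictedMappingLookup extends MappingLookup {
        private final Set<String> availableFields;

        RestrictedMappingLookup(Map<String, String> fieldTypes, Set<String> availableFields) {
            super(fieldTypes);
            this.availableFields = Set.copyOf(availableFields);
        }

        @Override
        Optional<String> fieldType(String name) {
            // Fields outside the available set look as if they were never mapped.
            return availableFields.contains(name) ? super.fieldType(name) : Optional.empty();
        }
    }

    public static void main(String[] args) {
        Map<String, String> mapping = Map.of("title", "text", "content", "text");
        // Models: FROM my-index | KEEP content | WHERE qstr("title:abc")
        MappingLookup lookup = new RestrictedMappingLookup(mapping, Set.of("content"));
        System.out.println(lookup.fieldType("content").isPresent()); // prints true
        System.out.println(lookup.fieldType("title").isPresent());   // prints false
    }
}
```

Because the parser only ever sees the restricted lookup, a query string that names title behaves as if the field did not exist, and the query text itself never has to be inspected.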

An additional improvement we can consider is making qstr work with renamed columns (TBD).

related #112590

elasticsearchmachine commented 1 month ago

Pinging @elastic/es-search-relevance (Team:Search Relevance)