elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

"nested" fields are not handled correctly when pushed down #117617

Open astefan opened 4 days ago

astefan commented 4 days ago

Description

test1 index

{
    "mappings": {
      "properties": {
        "address": {
          "properties": {
            "city": {
              "type": "nested",
              "properties": {
                "name": {
                  "type": "keyword"
                }
              }
            }
          }
        }
      }
    }
}

test2 index

{
    "mappings": {
      "properties": {
        "address": {
          "properties": {
            "city": {
              "type": "keyword",
              "fields": {
                "name": {
                  "type": "keyword"
                }
              }
            }
          }
        }
      }
    }
}

With the following test data:

{"index":{"_index":"test1","_id":1}}
{"address.city.name":"Paris"}
{"index":{"_index":"test2","_id":1}}
{"address.city":"London"}

For the query "from test* | sort address.city.name", the following exception is returned:

        "root_cause": [
            {
                "type": "query_shard_exception",
                "reason": "it is mandatory to set the [nested] context on the nested sort field: [address.city.name].",
                "index_uuid": "4h2St0xZThGwj08T7y75Yw",
                "index": "test1",
                "stack_trace": "[test1/4h2St0xZThGwj08T7y75Yw] org.elasticsearch.index.query.QueryShardException: it is mandatory to set the [nested] context on the nested sort field: [address.city.name].\r\n\tat org.elasticsearch.server@9.0.0-SNAPSHOT/org.elasticsearch.search.sort.FieldSortBuilder.validateMissingNestedPath(FieldSortBuilder.java:657)\r\n\tat org.elasticsearch.server@9.0.0-SNAPSHOT/org.elasticsearch.search.sort.FieldSortBuilder.nested(FieldSortBuilder.java:528)\r\n\tat org.elasticsearch.server@9.0.0-SNAPSHOT/org.elasticsearch.search.sort.FieldSortBuilder.build(FieldSortBuilder.java:351)\r\n\tat org.elasticsearch.server@9.0.0-SNAPSHOT/org.elasticsearch.search.sort.SortBuilder.buildSort(SortBuilder.java:164)\r\n\tat org.elasticsearch.xpack.esql.planner.EsPhysicalOperationProviders$DefaultShardContext.buildSort(EsPhysicalOperationProviders.java:276)\r\n\tat org.elasticsearch.compute.lucene.LuceneTopNSourceOperator$PerShardCollector.<init>(LuceneTopNSourceOperator.java:230)\r\n\tat org.elasticsearch.compute.lucene.LuceneTopNSourceOperator.collect(LuceneTopNSourceOperator.java:148)\r\n\tat org.elasticsearch.compute.lucene.LuceneTopNSourceOperator.getCheckedOutput(LuceneTopNSourceOperator.java:131)\r\n\tat org.elasticsearch.compute.lucene.LuceneOperator.getOutput(LuceneOperator.java:116)\r\n\tat org.elasticsearch.compute.operator.Driver.runSingleLoopIteration(Driver.java:258)\r\n\tat org.elasticsearch.compute.operator.Driver.run(Driver.java:189)\r\n\tat org.elasticsearch.compute.operator.Driver$1.doRun(Driver.java:378)\r\n\tat org.elasticsearch.server@9.0.0-SNAPSHOT/org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:27)\r\n\tat org.elasticsearch.server@9.0.0-SNAPSHOT/org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:34)\r\n\tat org.elasticsearch.server@9.0.0-SNAPSHOT/org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:1023)\r\n\tat 
org.elasticsearch.server@9.0.0-SNAPSHOT/org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:27)\r\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)\r\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)\r\n\tat java.base/java.lang.Thread.run(Thread.java:1575)\r\n"
            }
        ],
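
For context on what the error message is asking for: in the regular _search query DSL, a sort on a field under a nested object has to carry a nested context naming the nested path. A minimal sketch of such a sort clause for the test1 mapping above is shown below; the variable name is only illustrative, and ES|QL has no way to express this today.

```python
import json

# Sort clause as the _search API would expect it for the test1 mapping:
# "address.city" is the nested object, "address.city.name" the sorted field.
# This is query DSL, not ES|QL; the variable name is hypothetical.
nested_sort_clause = [
    {
        "address.city.name": {
            "order": "asc",
            "nested": {"path": "address.city"},
        }
    }
]

search_body = {"sort": nested_sort_clause}
print(json.dumps(search_body, indent=2))
```

The sort that ES|QL pushes down for test1 has no such nested context, which is what FieldSortBuilder.validateMissingNestedPath rejects.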

nested fields are currently not supported in ES|QL and the language ignores them completely; it is as if they do not exist. This is also reflected in how we create and analyze the result of the _field_caps response. In the scenario above, two fields with the same name and the same hierarchy path should be handled differently: the one in test2 pushed down as is (because it is supported), the other (the nested one in test1) "ignored" somehow.
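
To illustrate the distinction between the two indices, here is a rough sketch of a per-index check that walks a mapping and reports whether a field path passes through a nested object. It is not how ES|QL or _field_caps actually resolves fields; the function name and the idea of consulting the raw mapping are assumptions made for the example.

```python
from typing import Optional

# The two mappings from the description above.
TEST1 = {"mappings": {"properties": {"address": {"properties": {
    "city": {"type": "nested",
             "properties": {"name": {"type": "keyword"}}}}}}}}
TEST2 = {"mappings": {"properties": {"address": {"properties": {
    "city": {"type": "keyword",
             "fields": {"name": {"type": "keyword"}}}}}}}}


def nested_ancestor(mapping: dict, path: str) -> Optional[str]:
    """Hypothetical helper: return the dotted path of the first nested
    object met while resolving `path` (the field itself included), or
    None if there is none or the field is not mapped in this index."""
    node = mapping["mappings"]
    walked = []
    for segment in path.split("."):
        # Children can be sub-objects ("properties") or multi-fields ("fields").
        children = {**node.get("fields", {}), **node.get("properties", {})}
        if segment not in children:
            return None
        node = children[segment]
        walked.append(segment)
        if node.get("type") == "nested":
            return ".".join(walked)
    return None


for name, mapping in (("test1", TEST1), ("test2", TEST2)):
    print(name, nested_ancestor(mapping, "address.city.name"))
```

For address.city.name this reports address.city as the nested ancestor in test1 and nothing in test2, so under this sketch only the test2 variant of the field would be eligible for a plain sort pushdown.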

elasticsearchmachine commented 4 days ago

Pinging @elastic/es-analytical-engine (Team:Analytics)