elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

[ES|QL] Column verification happens before index filtering #113093

Open dgieselaar opened 1 month ago

dgieselaar commented 1 month ago

In ES|QL, column verification happens before index filtering. For instance, if I exclude the frozen tier or add a time range filter, ES|QL still complains about column type mismatches in indices that hold no data matching the query. This is unhelpful when, for example, a data stream has been rolled over with a mapping type change and the old backing indices break a query that only hits recent data. Here's a repro:

PUT my-index-old
{
  "mappings": {
    "properties": {
      "@timestamp": {
        "type": "date"
      },
      "keyword_or_number": {
        "type": "keyword"
      }
    }
  }
}

PUT my-index-new
{
  "mappings": {
    "properties": {
      "@timestamp": {
        "type": "date"
      },
      "keyword_or_number": {
        "type": "byte"
      }
    }
  }
}

POST my-index-old/_doc
{
  "@timestamp": "2024-01-01T00:00:00.000Z",
  "keyword_or_number": "my_text"
}

POST my-index-new/_doc
{
  "@timestamp": "2024-09-18T09:19:16.409Z",
  "keyword_or_number": 10
}

// returns keyword_or_number only as type byte

POST my-index*/_field_caps?fields=*&filter_path=fields.@timestamp*,fields.keyword_or_number*
{
  "index_filter": {
    "range": {
      "@timestamp": {
        "gte": "2024-02-01T00:00:00.000Z"
      }
    }
  }
}

// breaks with a `verification_exception`

POST _query
{
  "query": """
    FROM my-index*
      | STATS BY keyword_or_number
  """,
  "filter": {
    "range": {
      "@timestamp": {
        "gte": "2024-02-01T00:00:00.000Z"
      }
    }
  }
}

// "just works"

POST my-index*/_search
{
  "query": {
    "bool": {
      "filter": {
        "range": {
          "@timestamp": {
            "gte": "2024-02-01T00:00:00.000Z"
          }
        }
      }
    }
  },
  "aggs": {
    "group": {
      "terms": {
        "field": "keyword_or_number"
      }
    }
  }
}
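
To make the ordering problem concrete, here is a minimal Python sketch of the two orderings. It is not how Elasticsearch is implemented; the in-memory `INDICES` model and the helpers `matching_indices`, `resolve_types`, and `verify` are hypothetical names made up for illustration. It only mirrors the two indices and the `2024-02-01` cutoff from the repro above.

```python
from datetime import datetime, timezone

# Hypothetical in-memory model of the two indices from the repro:
# the mapping type of keyword_or_number and the timestamps of the indexed docs.
INDICES = {
    "my-index-old": {
        "type": "keyword",
        "timestamps": [datetime(2024, 1, 1, tzinfo=timezone.utc)],
    },
    "my-index-new": {
        "type": "byte",
        "timestamps": [datetime(2024, 9, 18, 9, 19, 16, tzinfo=timezone.utc)],
    },
}


def matching_indices(indices, gte):
    """Keep only indices holding at least one document at or after `gte`."""
    return {
        name: meta
        for name, meta in indices.items()
        if any(ts >= gte for ts in meta["timestamps"])
    }


def resolve_types(indices):
    """Collect the distinct mapping types of the field across the given indices."""
    return {meta["type"] for meta in indices.values()}


def verify(indices):
    """Raise if the field maps to more than one type (a stand-in for the type check)."""
    types = resolve_types(indices)
    if len(types) > 1:
        raise ValueError(f"conflicting types: {sorted(types)}")
    return next(iter(types))


cutoff = datetime(2024, 2, 1, tzinfo=timezone.utc)

# Verify first, filter later: the old index still takes part in type resolution.
try:
    verify(INDICES)
    verify_then_filter = "ok"
except ValueError as err:
    verify_then_filter = str(err)

# Filter first, verify later: only my-index-new remains, so the type is unambiguous.
filter_then_verify = verify(matching_indices(INDICES, cutoff))
```

In this toy model, verifying before filtering fails on the keyword/byte conflict, while filtering first leaves a single type, which is the behavior I would expect from ES|QL given what `_field_caps` with `index_filter` already returns.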

elasticsearchmachine commented 1 month ago

Pinging @elastic/es-analytical-engine (Team:Analytics)