elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

[ES|QL] Automatic Field Widening #116214

Open pickypg opened 1 week ago

pickypg commented 1 week ago

Description

Caused By [es/esql.query] failed: [verification_exception] Found 1 problem
line 2:24: Cannot use field [value] due to ambiguities being mapped as [2] incompatible types: [integer] in [test-1], [long] in [test-2]

My cluster began receiving documents with values orders of magnitude larger than any we had historically received, but the data was accurate. As a result, we changed the field's mapping from integer to long, only to uncover problems with existing queries that rely on the field. This is easy to reproduce without any data:

DELETE /test-*

PUT /test-1
{
  "mappings": {
    "properties": {
      "group": {
        "type": "keyword"
      },
      "value": {
        "type": "integer"
      }
    }
  }
}

PUT /test-2
{
  "mappings": {
    "properties": {
      "group": {
        "type": "keyword"
      },
      "value": {
        "type": "long"
      }
    }
  }
}
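
As an optional check that is not part of the original reproduction, the field capabilities API can confirm that the two indices disagree on the type of value. This is a minimal sketch that assumes the test-1 and test-2 indices created above; the response should list value under both integer and long, each with the indices that use that type.

```
GET /test-*/_field_caps?fields=value
```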

In 8.15 and earlier, this will fail:

POST /_query
{
  "query": """FROM test-1,test-2
| STATS sum_stat = SUM(value) BY group
| LIMIT 1"""
}

There appears to be no real solution in 8.14 or earlier. In 8.15 (and, so far, in later versions), the following succeeds. It also works if value::long is replaced with TO_LONG(value):

POST /_query
{
  "query": """FROM test-1,test-2
| STATS sum_stat = SUM(value::long) BY group
| LIMIT 1"""
}
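
For completeness, the TO_LONG form mentioned above would look roughly like this. It is a sketch of the same query with only the cast syntax changed, and it assumes the same test-1 and test-2 indices:

```
POST /_query
{
  "query": """FROM test-1,test-2
| STATS sum_stat = SUM(TO_LONG(value)) BY group
| LIMIT 1"""
}
```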

Given that this came out of a realistic scenario where a field needed larger capacity, it would be nice if ES|QL supported this cast automatically. I broke a production-level query by making what seemed like an innocent change: our own code already processed the field as a long, and we had simply mapped it as integer long ago by mistake without ever running into the issue. It seems like ES|QL should be able to widen to the necessary type when the conversion is clearly safe, as it is from integer to long.

elasticsearchmachine commented 6 days ago

Pinging @elastic/es-analytical-engine (Team:Analytics)