elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch
Other
69.95k stars 24.74k forks source link

ESQL: Arithmetic between unsigned long and long leads to null values due to implicit casting #102935

Closed craigtaverner closed 3 months ago

craigtaverner commented 10 months ago

Elasticsearch Version

8.11, 8.12

Installed Plugins

No response

Java Version

bundled

OS Version

all

Problem Description

When ESQL does arithmetic with long and unsigned long values, it implicitly casts to unsigned long, which leads to null results (and warnings) under a number of situations which are potentially common:

All of the above cases will lead to null results, but if we change the implicit case to long instead, all of the above will lead to valid results, except when:

This case is presumed to be far less likely with real world data than the other cases above, and therefor implicit casting to long is of higher utility than implicit casting to unsigned long.

And of course the user is free to do explicit casting if they desire a different behaviour.

Steps to Reproduce

Do any of the following:

Logs (if relevant)

No response

elasticsearchmachine commented 10 months ago

Pinging @elastic/es-ql (Team:QL)

elasticsearchmachine commented 10 months ago

Pinging @elastic/elasticsearch-esql (:Query Languages/ES|QL)

not-napoleon commented 10 months ago

I'm actually not sure we support it at all. I just tried the query

row a = to_long(59), b = to_unsigned_long(101)
| eval x = a + b
| keep x

and got this error:

[esql] > Unexpected error from Elasticsearch: verification_exception - Found 1 problem
line 2:12: first argument of [a + b] is [long] and second is [unsigned_long]. [unsigned_long] can only be operated on together with another [unsigned_long]
bpintea commented 10 months ago

I think unless we have more usage data, either casting direction is a good start. Just raising below some notes on the argumentations.

All of the above cases will lead to null results, but if we change the implicit case to long instead, all of the above will lead to valid results

I'm wondering how accurate this is. I guess this is as hard to evaluate with any precision, as to estimate the values the fields of any numeric type can take. The assumption here might be that the UL fields will preponderantly take values greater than Long.MAX_VALUE (which a Long can't), but is this assumption founded? If these values are, for instance, counters - which I assume they are -, their values' distribution would be equal.

And assuming we do the conversion of UL to long by shifting the range [0, UL_MAX] -> [Long.MIN_VALUE, Long.MAX_VALUE] (and not just simply truncating) and keeping the exact math operations (which we do now for arithmetic operators), the "probability" of a null value is not much different. Some examples:

  • Add a negative long to an unsigned long

banal -2L + 1UL -> null (long underflow)

  • Multiply a negative long to an unsigned long leading to a negative result

This has the same "probably" of a null (due to overflow) as multiplying any two longs.

  • Subtract an unsigned long from a long leading to a negative result

I guess this assumes the UL value is larger than long's, on the positive range, smth like 1L - 2UL? This will be "safer", but wondering if restricting the ranges of the operands is a reasonable exercise. (Othewise , -3L - 2UL -> null).

elasticsearchmachine commented 9 months ago

Pinging @elastic/es-analytics-geo (Team:Analytics)

alex-spies commented 9 months ago

For completeness' sake: In addition to arithmetics, I think we should also allow binary comparisons. Currently to_ul(1) < 2.0 fails due to 2.0 being a double.

elasticsearchmachine commented 8 months ago

Pinging @elastic/es-analytical-engine (Team:Analytics)

nik9000 commented 3 months ago

At this point operations between unsigned_long and other numeric fail at compile time - you must manually cast to or from unsigned_long. That's fine for now. It's not as good as lovely auto-casting, but it'll do for now.

craigtaverner commented 3 months ago

Just to verify that what @not-napoleon said above, I tested with both literals and fields:

literals

Query:

ROW a=TO_LONG(-2), b=TO_ULONG(1) | EVAL c = a + b

results in:

verification_exception
    Found 1 problem
        line 1:45: [+] has arguments with incompatible types [long] and [unsigned_long]

fields

PUT /test
{
  "mappings":{
    "properties": {
      "a":{"type":"long"},
      "b":{"type":"unsigned_long"}
    }
  }
}

POST /test/_doc
{
  "a":-2,
  "b":1
}

GET /test/_search

POST /_query?error_trace=true&format=txt
{
  "query": "FROM test | EVAL c = a + b"
}

Results in:

verification_exception
    Found 1 problem
        line 1:22: [+] has arguments with incompatible types [long] and [unsigned_long]