elastic / elasticsearch

Free and Open, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

Cannot index document whose date value is equivalent to Java's OffsetDateTime.MIN #81042

Open awalter17 opened 2 years ago

awalter17 commented 2 years ago

Elasticsearch version (bin/elasticsearch --version): Version: 7.10.2, Build: oss/docker/747e1cc71def077253878a59143c1f785afa92b9/2021-01-13T00:42:12.435326Z, JVM: 15.0.1

Plugins installed: []

JVM version (java -version): openjdk 11.0.13 2021-10-19 <- that's the java version on my machine not in the docker container

OS version (uname -a if on a Unix-like system): Linux, Fedora 34

Description of the problem including expected versus actual behavior:

It appears that Elasticsearch cannot index a date whose value is equivalent to OffsetDateTime.MIN. When I try to index a document with such a date field, I get a 400 response with an arithmetic_exception (full response in the steps below).

The date String -999999999-01-01T00:00:00+18:00 comes from a Java snippet like this:

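import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;

OffsetDateTime min = OffsetDateTime.MIN;
DateTimeFormatter formatter = DateTimeFormatter.ofPattern("uuuu-MM-dd'T'HH:mm:ssxxx");
String minString = min.format(formatter);
System.out.println(minString); // prints -999999999-01-01T00:00:00+18:00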

Steps to reproduce:

Please include a minimal but complete recreation of the problem, including (e.g.) index creation, mappings, settings, query etc. The easier you make for us to reproduce it, the more likely that somebody will take the time to look at it.

  1. Start the ES docker container: docker run -p 9200:9200 -p 9300:9300 -e discovery.type=single-node docker.elastic.co/elasticsearch/elasticsearch-oss:7.10.2

  2. Create an index with a date field: PUT to http://localhost:9200/testing-date with JSON body:

    {
    "settings" : {},
    "mappings" : {
        "properties" : {
            "test_date" : {
                "type" : "date",
                "format" : "uuuu-MM-dd'T'HH:mm:ssxxx"
            }
        }
    }
    }

    Response:

    {
    "acknowledged": true,
    "shards_acknowledged": true,
    "index": "testing-date"
    }
  3. Add a document to that index whose date is a string representation of Java's OffsetDateTime.MIN: POST to http://localhost:9200/testing-date/_doc with JSON body:

    {
    "test_date" : "-999999999-01-01T00:00:00+18:00"
    }

    Response:

    {
    "error": {
        "root_cause": [
            {
                "type": "mapper_parsing_exception",
                "reason": "failed to parse field [test_date] of type [date] in document with id 'zG_jVn0Buy9QhGs3zL4S'. Preview of field's value: '-999999999-01-01T00:00:00+18:00'"
            }
        ],
        "type": "mapper_parsing_exception",
        "reason": "failed to parse field [test_date] of type [date] in document with id 'zG_jVn0Buy9QhGs3zL4S'. Preview of field's value: '-999999999-01-01T00:00:00+18:00'",
        "caused_by": {
            "type": "arithmetic_exception",
            "reason": "long overflow"
        }
    },
    "status": 400
    }

Provide logs (if relevant):

Please let me know if you require any additional information from me.

elasticmachine commented 2 years ago

Pinging @elastic/es-search (Team:Search)

ywelsch commented 2 years ago

Dates in Elasticsearch are internally stored as a single long number representing milliseconds-since-the-epoch, as also documented here: https://www.elastic.co/guide/en/elasticsearch/reference/current/date.html

OffsetDateTime.MIN cannot be represented with a single long value (it's too far back), so the following fails: OffsetDateTime.MIN.toInstant().toEpochMilli(); // throws java.lang.ArithmeticException: long overflow
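The overflow can be reproduced outside Elasticsearch with plain java.time. The following is a minimal sketch, not Elasticsearch's parsing code; the class name EpochMilliOverflowDemo and the helper fitsInEpochMillis are hypothetical names chosen for illustration. It also prints the earliest and latest instants that a long of epoch milliseconds can express, which lie roughly 292 million years on either side of 1970.

```java
import java.time.Instant;
import java.time.OffsetDateTime;

public class EpochMilliOverflowDemo {

    /**
     * Hypothetical helper: returns true if the date-time converts to
     * milliseconds since the epoch without overflowing a long.
     */
    static boolean fitsInEpochMillis(OffsetDateTime dateTime) {
        try {
            dateTime.toInstant().toEpochMilli();
            return true;
        } catch (ArithmeticException e) {
            // toEpochMilli() signals "long overflow" with an ArithmeticException.
            return false;
        }
    }

    public static void main(String[] args) {
        // Bounds of what a single long of epoch milliseconds can express.
        System.out.println("earliest: " + Instant.ofEpochMilli(Long.MIN_VALUE));
        System.out.println("latest:   " + Instant.ofEpochMilli(Long.MAX_VALUE));

        // OffsetDateTime.MIN lies far outside those bounds.
        System.out.println("MIN fits: " + fitsInEpochMillis(OffsetDateTime.MIN));

        // An ordinary date converts without trouble.
        OffsetDateTime ordinary = OffsetDateTime.parse("1970-01-01T00:00:00+18:00");
        System.out.println("ordinary fits: " + fitsInEpochMillis(ordinary));
    }
}
```

Dates that must be indexed as an Elasticsearch date field need to fall between the two printed bounds.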

An Instant in general can't be represented with a single long value; see the Javadoc of Java's Instant class:

The range of an instant requires the storage of a number larger than a long. To achieve this, the class stores a long representing epoch-seconds and an int representing nanosecond-of-second, which will always be between 0 and 999,999,999. The epoch-seconds are measured from the standard Java epoch of 1970-01-01T00:00:00Z where instants after the epoch have positive values, and earlier instants have negative values. For both the epoch-second and nanosecond parts, a larger value is always later on the time-line than a smaller value.
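To make the quoted Javadoc concrete, here is a small sketch that reads the two parts an Instant stores (epoch-seconds and nanosecond-of-second) for OffsetDateTime.MIN. The class name InstantPartsDemo and the helper secondsOverflowAsMillis are hypothetical, and the check is deliberately coarse: it looks only at the seconds part and asks whether multiplying it by 1000 would leave the range of a long.

```java
import java.time.Instant;
import java.time.OffsetDateTime;

public class InstantPartsDemo {

    /**
     * Hypothetical, coarse check on the seconds part alone: seconds * 1000
     * overflows a long for any value outside [Long.MIN_VALUE / 1000, Long.MAX_VALUE / 1000].
     * The nanosecond part is ignored here.
     */
    static boolean secondsOverflowAsMillis(Instant instant) {
        long seconds = instant.getEpochSecond();
        return seconds < Long.MIN_VALUE / 1000 || seconds > Long.MAX_VALUE / 1000;
    }

    public static void main(String[] args) {
        // The conversion to Instant itself succeeds; only the millisecond form overflows.
        Instant min = OffsetDateTime.MIN.toInstant();

        System.out.println("epoch-seconds:        " + min.getEpochSecond());
        System.out.println("nanosecond-of-second: " + min.getNano());
        System.out.println("lowest seconds that still scale to millis: " + (Long.MIN_VALUE / 1000));
        System.out.println("overflows as millis:  " + secondsOverflowAsMillis(min));
    }
}
```

The seconds value fits comfortably in a long; it is only the scaling to milliseconds that pushes it out of range.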

I think the limitation of not being able to represent such dates is OK. The exception message could be clearer, however, so I'm keeping this issue open as an enhancement.
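As an illustration of what a clearer message might look like, the sketch below wraps the overflow in an exception that names the field and the offending value. This is not Elasticsearch's implementation or a proposed patch; the class ClearerDateError, the method toEpochMilliOrExplain, and the message wording are all assumptions made for this example.

```java
import java.time.OffsetDateTime;

public class ClearerDateError {

    /**
     * Hypothetical helper: converts a date-time to epoch milliseconds and,
     * on overflow, rethrows with a message that names the field and the value.
     */
    static long toEpochMilliOrExplain(String fieldName, OffsetDateTime value) {
        try {
            return value.toInstant().toEpochMilli();
        } catch (ArithmeticException e) {
            throw new IllegalArgumentException(
                "date [" + value + "] for field [" + fieldName + "] is outside the range "
                    + "that can be stored as milliseconds since the epoch", e);
        }
    }

    public static void main(String[] args) {
        try {
            toEpochMilliOrExplain("test_date", OffsetDateTime.MIN);
        } catch (IllegalArgumentException e) {
            // The original ArithmeticException stays available as the cause.
            System.out.println(e.getMessage());
            System.out.println("cause: " + e.getCause());
        }
    }
}
```

Keeping the ArithmeticException as the cause preserves the original diagnostic while the outer message tells the user which value was rejected and why.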

elasticsearchmachine commented 1 month ago

Pinging @elastic/es-search-foundations (Team:Search Foundations)