
[Reporting/CSV Export] Error when paging with an invalid `search_after` value #199850

tsullivan commented 1 week ago

Description

CSV Export uses a point-in-time (PIT) context to page through the data needed for the export. When retrieving any page of data after the first, the export mechanism adds a search_after field to the query, set to the last sort value from the previous page of results.
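
For illustration, the paged requests look roughly like the following sketch (the index name, PIT id, and sort value are placeholders, not values from this issue; the size corresponds to xpack.reporting.csv.scroll.size):

POST /test/_pit?keep_alive=1m

POST /_search
{
  "size": 500,
  "pit": { "id": "<pit-id-from-previous-response>", "keep_alive": "1m" },
  "sort": [{ "timestamp": "asc" }],
  "search_after": [1622814600000]
}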

If the sort field is absent from some of the indices, the value of the search_after field may automatically become 9223372036854776000 (this appears to be Long.MAX_VALUE, 9223372036854775807, rounded to the nearest representable double). Using this value in the search_after field causes Elasticsearch to return an error:

{
  "error": {
    "root_cause": [
      {
        "type": "parse_exception",
        "reason": "failed to parse date field [9223372036854776000] with format [strict_date_optional_time||epoch_millis]: [failed to parse date field [9223372036854776000] with format [strict_date_optional_time||epoch_millis]]"
      }
    ],
    "type": "search_phase_execution_exception",
    "reason": "all shards failed",
    "phase": "query",
    "grouped": true,
    "failed_shards": [
      {
        "shard": 0,
        "index": "test",
        "node": [redacted],
        "reason": {
          "type": "parse_exception",
          "reason": "failed to parse date field [9223372036854776000] with format [strict_date_optional_time||epoch_millis]: [failed to parse date field [9223372036854776000] with format [strict_date_optional_time||epoch_millis]]",
          "caused_by": {
            "type": "illegal_argument_exception",
            "reason": "failed to parse date field [9223372036854776000] with format [strict_date_optional_time||epoch_millis]",
            "caused_by": {
              "type": "date_time_parse_exception",
              "reason": "Failed to parse with all enclosed parsers"
            }
          }
        }
      }
    ]
  },
  "status": 400
}
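
The failure can be reproduced directly with a minimal request against the test index from the steps below (a sketch; it assumes timestamp is mapped as a date field):

POST /test/_search
{
  "sort": [{ "timestamp": "asc" }],
  "search_after": [9223372036854776000]
}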

Steps to reproduce

  1. Add the setting in kibana.yml: xpack.reporting.csv.scroll.size: 1 (so each page of the export contains a single document, forcing the export to page with search_after)
  2. Use test data with a sparse timestamp field:

    POST _bulk
    {"index":{"_index":"test","_id":"1"}}
    {"field1":"value1", "timestamp": "2021-06-04T13:50:00.000000001"}
    {"index":{"_index":"test","_id":"2"}}
    {"field1":"value2"}
    {"index":{"_index":"test","_id":"3"}}
    {"field1":"value3"}
    
    POST test/_refresh
  3. Add a data view in Kibana for the test index that does NOT use a time field.
  4. Search the data in Discover and sort by the timestamp field
  5. Use the sharing options to create a CSV export

Result:

  1. Errors are logged in the server, and the report completes with warnings.
  2. The export contains the first 2 rows of data, but not the 3rd (only the first query with search_after succeeded).
tsullivan commented 1 week ago

  If the sort field is absent from some of the indices, the value of the search_after field may automatically become 9223372036854776000

This has come up in an Elasticsearch issue that has a comment with some context: https://github.com/elastic/elasticsearch/issues/73772#issuecomment-854756782. Presumably, the bug is that the JSON parser rounds Long.MAX_VALUE (9223372036854775807), which Elasticsearch returns as the sort value for documents missing the field, into 9223372036854776000, since that integer is not exactly representable as a double.
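
A plain sorted search over the repro data shows where that value originates: documents missing the timestamp field sort last, and each hit's sort array should report Long.MAX_VALUE for them (a sketch against the test index):

POST /test/_search
{
  "sort": [{ "timestamp": { "order": "asc" } }]
}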

elasticmachine commented 1 week ago

Pinging @elastic/appex-sharedux (Team:SharedUX)

tsullivan commented 1 week ago

There is a workaround for this issue. The trigger is this step from the reproduction:

  Search the data in Discover and sort by the timestamp field

This bug only happens when users sort by a field that has sparse values. To avoid the issue, users should not sort the data in Discover by a sparsely populated field, or should simply not select any field to sort by.
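
One way to check whether a candidate sort field is sparse before exporting is an exists query; this is a sketch using the test index from the repro:

POST /test/_count
{
  "query": {
    "bool": {
      "must_not": { "exists": { "field": "timestamp" } }
    }
  }
}

A non-zero count means some documents lack the field and would be assigned the problematic sort value when used for sorting.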