opensearch-project / OpenSearch

🔎 Open source distributed and RESTful search engine.
https://opensearch.org/docs/latest/opensearch/index/
Apache License 2.0

[BUG] Composite aggregation 'after_key' reference with a date_histogram in different time zones #7479

Open sasin-jonas opened 1 year ago

sasin-jonas commented 1 year ago

Describe the bug

When using the composite aggregation with a date_histogram source, time-zone offsets used by the 'after_key' reference are not handled correctly. Identified version: 1.3.9

To Reproduce

Steps to reproduce the behavior:

  1. Create an index with example data: data.csv; schema: csv-random-idx-scheme.txt
  2. Use a composite aggregation with a time zone other than UTC
  3. Example requests: request1.txt request2.txt
  4. The first request returns an after_key (combination of columns of last aggregation record). This after_key is used as the 'after' parameter in the second request.
  5. The second request's result contains duplicated or skipped records (depending on the time zone used), because the after_key's timestamp reference is not adjusted for the time zone. For example, with a UTC+5 time zone, the second request returns results starting at a timestamp that is 5 hours before the timestamp of the first aggregation's last record.
  6. When paging through the results using the 'after_key', the total number of aggregation results differs depending on the time zone used (only UTC returns the correct results).
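
The paging flow in steps 4 to 6 can be sketched roughly as follows. This is a minimal sketch and not the reporter's actual requests (the contents of request1.txt and request2.txt are not reproduced here). The field name `timestamp`, the aggregation name `by_time`, the source name `ts`, the one-hour interval, and the `search` callable are all hypothetical; `search` stands in for whatever client call sends the body to the `_search` endpoint and returns the parsed JSON response.

```python
from typing import Any, Callable, Dict, List, Optional

Body = Dict[str, Any]


def build_request(time_zone: str, after: Optional[Dict[str, Any]] = None,
                  page_size: int = 100) -> Body:
    """Build a composite aggregation body with one date_histogram source.

    Field, aggregation, and source names are hypothetical placeholders.
    """
    composite: Dict[str, Any] = {
        "size": page_size,
        "sources": [
            {"ts": {"date_histogram": {
                "field": "timestamp",
                "fixed_interval": "1h",
                "time_zone": time_zone,
            }}}
        ],
    }
    if after is not None:
        # Step 4: feed the previous page's after_key back in as 'after'.
        composite["after"] = after
    return {"size": 0, "aggs": {"by_time": {"composite": composite}}}


def collect_buckets(search: Callable[[Body], Body], time_zone: str,
                    page_size: int = 100, max_pages: int = 1000) -> List[Body]:
    """Page through all buckets, following after_key until a page is empty."""
    buckets: List[Body] = []
    after: Optional[Dict[str, Any]] = None
    for _ in range(max_pages):  # guard against a paging loop that never ends
        agg = search(build_request(time_zone, after, page_size))["aggregations"]["by_time"]
        page = agg.get("buckets", [])
        if not page:
            break
        buckets.extend(page)
        after = agg.get("after_key")
        if after is None:
            break
    return buckets


def duplicated_keys(buckets: List[Body]) -> List[Any]:
    """Return 'ts' keys that occur more than once across all pages."""
    seen = set()
    dups = []
    for bucket in buckets:
        key = bucket["key"]["ts"]
        if key in seen:
            dups.append(key)
        seen.add(key)
    return dups
```

Running `collect_buckets` once with `"UTC"` and once with, say, `"+05:00"`, then comparing `len(...)` and `duplicated_keys(...)` for the two runs, would be one way to observe the discrepancy described in steps 5 and 6.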

Expected behavior

OpenSearch should take the time zone into account when interpreting the timestamp in the after_key reference, so that paging returns the same buckets regardless of the time zone used.
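
To make the expectation concrete, here is a small check one might run on two consecutive pages. It is only a sketch under stated assumptions: bucket keys are taken to be epoch milliseconds, buckets are in ascending order, and the date_histogram is the first (or only) source in the composite key. The function name `rewind_hours` is hypothetical. With correct paging the result should be 0; for the UTC+5 case described in the report it would be 5.

```python
MS_PER_HOUR = 3_600_000


def rewind_hours(after_key_ms: int, next_page_first_key_ms: int) -> float:
    """Hours by which the next page's first key precedes the after_key.

    Returns 0.0 when the next page starts at or after the after_key.
    """
    return max(0, after_key_ms - next_page_first_key_ms) / MS_PER_HOUR
```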

peternied commented 1 year ago

Thanks for the detailed issue report, @sasin-jonas. I'm marking this issue as triaged.