opensearch-project / OpenSearch

🔎 Open source distributed and RESTful search engine.
https://opensearch.org/docs/latest/opensearch/index/
Apache License 2.0

[Feature Request] Start of week setting for date histogram aggregations #14816

Open jjfalk opened 2 months ago

jjfalk commented 2 months ago

Is your feature request related to a problem? Please describe

Some of our date histogram aggregations require a calendar-aware weekly interval with the week starting on Sunday instead of Monday, which OpenSearch always assumes. We tried to work around this with an offset of "-1d", but this offset ignores time zones. As a result, if the results span a DST change weekend, the buckets come out wrong. Example:

Describe the solution you'd like

Either:

  1. An option to change the start of the week, e.g. to set it to Sunday,
  2. An option to use a calendar-aware and time-zone-aware offset.
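
For illustration, here is a minimal Python sketch of the bucketing that option 1 would be expected to produce: each timestamp is mapped to the local midnight of the most recent Sunday. The function name `sunday_week_start` and the zone `America/New_York` are assumptions made for this example; this is not OpenSearch code.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo


def sunday_week_start(ts_utc: datetime, tz: ZoneInfo) -> datetime:
    """Return local midnight of the Sunday that starts the week containing ts_utc."""
    local = ts_utc.astimezone(tz)
    # datetime.weekday() uses Monday=0 ... Sunday=6, so Sunday maps to 0 days back.
    days_since_sunday = (local.weekday() + 1) % 7
    start_date = local.date() - timedelta(days=days_since_sunday)
    # Rebuilding from the calendar date keeps the boundary at local midnight,
    # whatever the UTC offset is on that day.
    return datetime(start_date.year, start_date.month, start_date.day, tzinfo=tz)


tz = ZoneInfo("America/New_York")  # hypothetical zone for the example
late_saturday = datetime(2024, 3, 10, 4, 30, tzinfo=timezone.utc)  # Sat 23:30 local
early_sunday = datetime(2024, 3, 10, 5, 30, tzinfo=timezone.utc)   # Sun 00:30 local
print(sunday_week_start(late_saturday, tz).isoformat())
print(sunday_week_start(early_sunday, tz).isoformat())
```

The two sample timestamps are one hour apart but fall on either side of a Sunday midnight, so they land in different weekly buckets.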

Related component

Search:Aggregations

Describe alternatives you've considered

No response

Additional context

No response

finnegancarroll commented 1 month ago

Hi @jjfalk! If I understand correctly, the -1d offset causes a timestamp that previously fell within DST to exit DST. The shifted timestamp then loses an additional hour, which ultimately changes the weekday and therefore the aggregation bucket? Is the data being ingested or queried with a specified time_zone?

jjfalk commented 1 month ago

Hi @finnegancarroll, yes, your description is pretty much correct. The data is ingested in UTC, while the aggregation request specifies the relevant time zone ID.
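
For readers following along, a request of the kind described in this thread might look roughly like the body below, written as a Python dict. The aggregation name `per_week`, the field name `timestamp`, and the zone `America/New_York` are placeholders assumed for illustration; only `calendar_interval`, `offset`, and `time_zone` reflect what is discussed here.

```python
import json

# Hypothetical request body: weekly calendar buckets shifted back by one day
# in an attempt to start weeks on Sunday, evaluated in a named time zone.
request_body = {
    "size": 0,
    "aggs": {
        "per_week": {  # placeholder aggregation name
            "date_histogram": {
                "field": "timestamp",             # placeholder field name
                "calendar_interval": "week",      # weeks start on Monday
                "offset": "-1d",                  # the workaround from this issue
                "time_zone": "America/New_York",  # placeholder time zone ID
            }
        }
    },
}

print(json.dumps(request_body, indent=2))
```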

finnegancarroll commented 2 weeks ago

The core issue here seems to be that the offset field of a date_histogram is always fixed. That is, any offset is immediately converted into milliseconds and is eventually used as the offset for an OffsetRounding, which the date histogram aggregator uses to determine bucket boundaries.
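
A small Python sketch of the effect, under a simplified model in which the bucket boundary is the Monday local midnight shifted by a fixed number of milliseconds on the UTC timeline. The helper `fixed_offset_boundary` and the zone `America/New_York` are assumptions for this example and do not mirror the actual OffsetRounding code.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

TZ = ZoneInfo("America/New_York")  # hypothetical zone; DST began on 2024-03-10


def fixed_offset_boundary(local_boundary: datetime, offset_ms: int) -> datetime:
    """Shift a bucket boundary by a fixed millisecond offset on the UTC timeline."""
    shifted_utc = local_boundary.astimezone(timezone.utc) + timedelta(milliseconds=offset_ms)
    return shifted_utc.astimezone(TZ)


# Monday-based week boundary right after the DST change weekend.
monday = datetime(2024, 3, 11, tzinfo=TZ)
boundary = fixed_offset_boundary(monday, -86_400_000)  # "-1d" as 24 fixed hours
print(boundary.isoformat())  # 2024-03-09T23:00:00-05:00
```

Because the Sunday of the DST change is only 23 hours long, subtracting 24 fixed hours lands on Saturday at 23:00 local time rather than Sunday at midnight.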

This initial conversion of the offset to milliseconds is a misstep, since not every day is the same length. I think the fix here, which will support the above feature, is to ingest the offset as a DateTimeUnit. Once we are able to dynamically look up interval lengths, we can support 'non-fixed' day, week, month, and year offsets.
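
As a rough illustration of the idea (in Python, not the OpenSearch implementation), a calendar-aware day offset can be applied to the local wall-clock time, so that the length of the shifted day is looked up from the time zone rules instead of being assumed to be 24 hours. The helper `calendar_offset_boundary` and the zone `America/New_York` are assumptions for this example.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

TZ = ZoneInfo("America/New_York")  # hypothetical zone; DST began on 2024-03-10


def calendar_offset_boundary(local_boundary: datetime, days: int) -> datetime:
    """Shift a boundary by whole calendar days in local wall-clock time."""
    # Adding a timedelta to an aware datetime moves the wall clock; the UTC
    # offset is then re-resolved from the zone rules for the new date.
    return local_boundary + timedelta(days=days)


monday = datetime(2024, 3, 11, tzinfo=TZ)
sunday = calendar_offset_boundary(monday, -1)
print(sunday.isoformat())  # 2024-03-10T00:00:00-05:00

# Compare in UTC to get the real elapsed time of that Sunday.
day_length = monday.astimezone(timezone.utc) - sunday.astimezone(timezone.utc)
print(day_length)  # 23:00:00
```

The boundary stays at Sunday local midnight, and the 23-hour day length falls out of the zone data rather than a fixed constant.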