Issue https://github.com/elastic/elasticsearch/issues/93180 noted that using `offset` with a `calendar_interval` gives surprising results when the offset is longer than the interval. The offset is a fixed duration, and because calendar months differ in length, adding a fixed duration to calendar-aligned bucket boundaries does not give every bucket the same starting day. For example, adding `"offset": "+35d"` to `"calendar_interval": "month"` moves the bucket starting at 2022-01-01 to start at 2022-02-05, while the bucket starting at 2022-02-01 moves to 2022-03-08. Note that the starting day of the month is different. Before the offset, every bucket in the histogram started on the same day of the month; after it, they do not.
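The drift can be reproduced outside Elasticsearch with plain date arithmetic. The following is a minimal sketch in Python, assuming only that the offset is added to each bucket start as a fixed number of days. The names `FIXED_OFFSET`, `month_starts` and `shifted` are illustrative and not part of any Elasticsearch API.

```python
from datetime import date, timedelta

# Fixed-length offset, the equivalent of "offset": "+35d".
FIXED_OFFSET = timedelta(days=35)

# Calendar-aligned bucket starts, as "calendar_interval": "month" produces.
month_starts = [date(2022, month, 1) for month in (1, 2, 3)]

# Shift every bucket start by the same fixed duration.
shifted = [start + FIXED_OFFSET for start in month_starts]

for original, moved in zip(month_starts, shifted):
    # The day of the month varies because months differ in length.
    print(f"{original} -> {moved} (day of month: {moved.day})")
```

The day of the month printed for each shifted bucket depends on the length of the month being crossed, which is why no fixed offset can keep the buckets aligned.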
The use case in the original issue was to define financial years and financial quarters as date histogram buckets. Elasticsearch defines the `calendar_interval` of a `quarter` as starting on the 1st of January, April, July and October. If we want the financial year to start on the 4th of February (and therefore every quarter to start on the 4th of its respective month), it would be great to specify this with an `offset`. But the obvious choice of `+35d` will not work, as described above.
We need a concept of a calendar offset. One approach would be to enhance the `offset` field to accept calendar offset specifications. For example, the above use case could be supported with `"offset": "+1m+3d"`, with the `m` meaning calendar month.
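To illustrate the intended semantics, here is a minimal Python sketch of what a calendar offset of one month plus three days could do to the quarter boundaries. The helper `add_calendar_offset` is hypothetical and only models the proposal; it is not an existing Elasticsearch function, and clamping the day for short months is an assumption of this sketch.

```python
import calendar
from datetime import date, timedelta


def add_calendar_offset(start: date, months: int, days: int) -> date:
    """Hypothetical helper: shift by whole calendar months, then by fixed days."""
    # Work on a zero-based month index so year rollover is simple arithmetic.
    month_index = start.year * 12 + (start.month - 1) + months
    year, month_zero_based = divmod(month_index, 12)
    month = month_zero_based + 1
    # Assumption: clamp the day when the target month is shorter.
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day) + timedelta(days=days)


# Quarter starts as Elasticsearch defines them for "calendar_interval": "quarter".
quarter_starts = [date(2022, month, 1) for month in (1, 4, 7, 10)]

# Model of the proposed "+1m+3d": one calendar month, then three days.
financial_quarters = [add_calendar_offset(q, months=1, days=3) for q in quarter_starts]

for original, moved in zip(quarter_starts, financial_quarters):
    print(f"{original} -> {moved}")
```

Because the month part is applied as a calendar step rather than a fixed duration, every shifted quarter lands on the 4th of its month.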