Open niemyjski opened 4 years ago
Pinging @elastic/es-analytics-geo (:Analytics/Aggregations)
Heya, thanks for the detailed ticket... and apologies for the confusion/troubles :( I'll start with the easy ones first:
Also of note, it appears when you use kibana it creates a date_histogram via a snippet using interval (legacy).
Correct, although hopefully only temporarily. Kibana, like many user applications, is still migrating off the old interval. Since date histograms are so widely used, we expect to have a long deprecation period on interval.
one talks about nanos and micro seconds
Nanoseconds are in a tricky situation right now. They can be indexed and searched (at cost of reduced dynamic range), but aggregations currently only work in millisecond range regardless of the underlying data type. This is noted in the date_nano docs at the bottom. That's why nano support isn't mentioned on the date histogram page.
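To illustrate what "aggregations only work in millisecond range" means for bucketing, here is a minimal sketch. It assumes sub-millisecond precision is simply dropped before the bucket key is computed; the class and method names are hypothetical and this is not Elasticsearch's actual code.

```java
import java.time.Instant;

class MillisBucketSketch {
    // Hypothetical helper: discard sub-millisecond precision, then round
    // down to the start of the bucket that contains the timestamp.
    static long bucketKeyMillis(Instant timestamp, long intervalMillis) {
        long millis = timestamp.toEpochMilli(); // anything below 1 ms is lost here
        return Math.floorDiv(millis, intervalMillis) * intervalMillis;
    }

    public static void main(String[] args) {
        // Two timestamps that differ only below the millisecond.
        Instant a = Instant.ofEpochSecond(1_576_000_000L, 1_000_100L);
        Instant b = Instant.ofEpochSecond(1_576_000_000L, 1_999_900L);
        // Both land in the same 1 ms bucket, so they cannot be told apart.
        System.out.println(bucketKeyMillis(a, 1L) == bucketKeyMillis(b, 1L));
    }
}
```

Under that assumption, a bucket narrower than one millisecond cannot be expressed even when the field itself stores nanoseconds.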
I'm not sure what the smallest value supported for buckets is, but looking at DateHistogramInterval, it looks like it may be 1s, which can also be supplied as 1000ms input:
These are technically just "helper" methods, and were more useful with the old interval. Today they just append the correct suffix to the provided numeric value. It does look like we're missing the "milliseconds" helper, but you can manually specify it with:
new DateHistogramInterval("5ms")
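The "append the correct suffix" behaviour can be sketched without the Elasticsearch classes. The class below is a stand-alone illustration: its method names mirror the style of the helpers described above, and milliseconds() is the hypothetical missing helper, not something the library is confirmed to provide.

```java
class IntervalHelperSketch {
    // Each helper only appends a unit suffix to the number it is given.
    static String seconds(int n) { return n + "s"; }
    static String minutes(int n) { return n + "m"; }

    // Hypothetical: what a "milliseconds" helper could look like.
    static String milliseconds(int n) { return n + "ms"; }

    public static void main(String[] args) {
        // Produces the same string that is passed by hand in the line above.
        System.out.println(milliseconds(5));
    }
}
```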
1M
This should work for calendar_interval and interval, but not fixed_interval.
month
This should work for calendar_interval and interval, but not fixed_interval.
M
This should not work, and I don't think it's ever been supported. For example, neither of the calendar-aware parsers in 6.0 or 2.0 supports the unit by itself, and fixed intervals don't support calendar concepts like months. I just tested on a 6.0 build and it does indeed throw an exception:
failed to parse setting [DateHistogramAggregationBuilder.interval] with value [M] as a time value: unit is missing or unrecognized
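The calendar rules described above (1M and month accepted, a bare M rejected) amount to a lookup against a fixed set of spellings. The sketch below assumes that set is the single-quantity short form plus the full unit name for each calendar unit; only 1M, month and M are confirmed by this thread, and the class is hypothetical rather than the actual parser.

```java
import java.util.Set;

class CalendarIntervalSketch {
    // Assumed spellings: "1" + unit letter, or the full unit name.
    // A bare unit letter such as "M" is deliberately absent.
    static final Set<String> CALENDAR_UNITS = Set.of(
        "minute", "1m", "hour", "1h", "day", "1d", "week", "1w",
        "month", "1M", "quarter", "1q", "year", "1y");

    static boolean isCalendarInterval(String value) {
        return CALENDAR_UNITS.contains(value); // exact, case-sensitive match
    }

    public static void main(String[] args) {
        for (String candidate : new String[] {"1M", "month", "M"}) {
            System.out.println(candidate + " accepted: " + isCalendarInterval(candidate));
        }
    }
}
```

The match is case-sensitive because, in this assumed set, 1m (minute) and 1M (month) are different units.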
I tested the fixed units too and they also don't work, although they fail with a parsing error and an ugly exception:
"reason": "failed to parse [s]",
"caused_by": {
"type": "number_format_exception",
"reason": "For input string: \"\""
}
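The number_format_exception above is what you would expect if the parser strips the unit suffix and hands the remainder to a numeric parser: for the input s the remainder is the empty string. The sketch below reproduces that mechanism under assumed fixed units (ms, s, m, h, d); the class is hypothetical and not the actual Elasticsearch parser.

```java
class FixedIntervalParseSketch {
    // Assumed fixed units. "ms" is checked first so "5ms" is not read as ending in "s".
    static final String[] UNITS = {"ms", "s", "m", "h", "d"};
    static final long[] UNIT_MILLIS = {1L, 1_000L, 60_000L, 3_600_000L, 86_400_000L};

    static long parseFixedMillis(String value) {
        for (int i = 0; i < UNITS.length; i++) {
            if (value.endsWith(UNITS[i])) {
                String number = value.substring(0, value.length() - UNITS[i].length());
                // A bare unit leaves number == "", so parseLong throws NumberFormatException.
                return Long.parseLong(number) * UNIT_MILLIS[i];
            }
        }
        // Calendar units such as "1M" end up here: no fixed unit matches.
        throw new IllegalArgumentException("unit is missing or unrecognized: " + value);
    }

    public static void main(String[] args) {
        System.out.println(parseFixedMillis("1s") == parseFixedMillis("1000ms"));
        try {
            parseFixedMillis("s");
        } catch (NumberFormatException e) {
            System.out.println("bare unit rejected: " + e.getMessage());
        }
    }
}
```

Checking for an empty numeric part before parsing would let a parser like this report a clearer "unit without a value" error instead of the raw exception.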
So in conclusion, it seems we (ES) have two action items:
Add a milliseconds() helper method to match the rest of the DateHistogramInterval helpers
I hope this helped clear up the situation!
Pinging @elastic/es-docs (Team:Docs)
Elasticsearch version (bin/elasticsearch --version): 7.5
Plugins installed: ["mapper-size"]
JVM version (java -version): docker-image
OS version (uname -a if on a Unix-like system): docker-image
Description of the problem including expected versus actual behavior:
We noticed that aggregations don't behave how we expected them to as per the docs. Please see this issue we logged on nest for more context: https://github.com/elastic/elasticsearch-net/issues/4251
Steps to reproduce:
The docs state that 1M, month and M are valid calendar intervals. However, not all of them work correctly when running in ES. This translates to pretty much all the values listed in the documentation (on two different pages, one talks about nanos and micro seconds but the main documentation doesn't).
Also of note, it appears when you use kibana it creates a date_histogram via a snippet using interval (legacy).
Provide logs (if relevant):
I'm doing a date_histogram aggregation with calendar_interval values of:
1M
  1M to DateHistogramAggregationDescriptor<T>.CalendarInterval
month
  month to DateHistogramAggregationDescriptor<T>.CalendarInterval with Expression 'month' is invalid
M
  "The supplied interval [M] could not be parsed as a calendar interval.", but the documentation page says that it is a valid value.
  M to DateHistogramAggregationDescriptor<T>.CalendarInterval with Expression 'M' is invalid
Originally posted by @ejsmith in https://github.com/elastic/elasticsearch-net/issues/4251#issuecomment-564328065
The Date Histogram docs list fixed intervals down to milliseconds. The time values docs list values down to nanos.
I'm not sure what the smallest value supported for buckets is, but looking at DateHistogramInterval, it looks like it may be 1s, which can also be supplied as 1000ms input:
https://github.com/elastic/elasticsearch/blob/6ae6f57d39f473e4968700a28a582b93fe3a3bf4/server/src/main/java/org/elasticsearch/search/aggregations/bucket/histogram/DateHistogramInterval.java#L37-L66
There's something not right; either the documentation should be clearer, or the implementation should also support the single character time units, which it looks like it doesn't:
https://github.com/elastic/elasticsearch/blob/23bf310c849c77e50ab87dbb59ea42d9412fe187/server/src/main/java/org/elasticsearch/search/aggregations/bucket/histogram/DateHistogramAggregationBuilder.java#L79-L98
Originally posted by @russcam in https://github.com/elastic/elasticsearch-net/issues/4251#issuecomment-564332921