jordansissel closed this issue 5 years ago.
Hi All,
I know this is difficult, however I wondered if this issue has moved on with the advent of version 6?
Thanks!
Sorry, it hasn't moved.
Any ideas when it will be? I've been recommended ELK and I hit this.
No idea. The only thing I can tell you is that it won't be fixed in the short term.
Am I right that, for dumb people like me, the issue is this: when I send in "ts": "2017-08-30T14:26:30.9157480Z", ES converts that to 1504103190915. It chops off the last four fractional digits and parses the rest as a date, so the sub-millisecond digits are lost and sorting/search is not as accurate as expected?
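For what it's worth, the same truncation is easy to reproduce with plain java.time (a minimal sketch, not ES internals; ES does its own parsing, but the loss is the same):

```java
import java.time.Instant;

public class TruncationDemo {
    public static void main(String[] args) {
        // Parse the 7-digit fractional timestamp from the comment above.
        Instant ts = Instant.parse("2017-08-30T14:26:30.9157480Z");
        System.out.println(ts.getNano());      // 915748000 (full nanosecond part)
        System.out.println(ts.toEpochMilli()); // 1504103190915 (the remaining 748 µs are dropped)
    }
}
```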
@jchannon what is your use case that requires that precision?
@jpountz maybe 6.x sorted indexes can be the answer to this, or are they using the same precision as the indexed values?
My use case? I have always logged to 6 decimal places and want to keep it that way. I'm astounded that this highly recommended piece of software is so poor at storing/converting dates.
Our use case is that we ingest logs via kubernetes => fluentd (0.14) => elasticsearch, and logs that are emitted rapidly (anything under a millisecond apart, which is easily done) obviously have no way of being kept in order when displayed in Kibana.
Same issue; we are tracking events that happen with nanosecond precision.
Is there any plan to increase it?
Yes, but we need to move from Joda to Java.time in order to do so. See https://github.com/elastic/elasticsearch/issues/27330
I opened a bug in Logback, since its core interface also stores timestamps at millisecond resolution, so precision is lost even earlier, before ES: https://jira.qos.ch/browse/LOGBACK-1374
It seems that the historical java.util.Date type is the cause of these problems in the Java world.
Same use case: using a kubernetes/filebeat/elasticsearch stack for log collection, but not having nanosecond precision leads to incorrect ordering of logs.
Seems like we need to consider having the collectors provide a monotonically increasing counter that records the order in which the logs were collected. Nanosecond precision does not necessarily solve the problem, because the clock's resolution might not actually be a nanosecond.
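A minimal sketch of that counter idea in Java (names and shape are my own; the point is just that collection order survives even when timestamps tie):

```java
import java.time.Instant;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical collector-side event: a per-process monotonic sequence
// number records collection order, independent of clock resolution.
public class SequencedEvent {
    private static final AtomicLong SEQ = new AtomicLong();

    public final Instant timestamp = Instant.now(); // primary sort key
    public final long seq = SEQ.getAndIncrement();  // tie-breaker sort key
    public final String message;

    public SequencedEvent(String message) {
        this.message = message;
    }
}
```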
Seriously, guys? This bug is almost 3 years old...
The problem is also that if you try to find a workaround, you run into a series of other bugs, so there is not even an acceptable workaround.
The only viable one seems to be storing the epoch plus two additional digits, which are incremented in Logstash whenever the timestamp matches the previous one.
Has anyone found a better approach?
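The logic of that workaround, sketched in Java rather than as a Logstash filter (the two-digit budget and the saturation behaviour are assumptions read off the comment above):

```java
// Hypothetical sketch: epoch millis * 100 leaves two spare decimal
// digits, which are bumped whenever consecutive events share a millisecond.
public class TieBreakingClock {
    private long lastMillis = -1;
    private int subCounter = 0; // 0..99, reset on every new millisecond

    public synchronized long next(long epochMillis) {
        if (epochMillis == lastMillis) {
            subCounter = Math.min(subCounter + 1, 99); // saturates after 100 ties
        } else {
            lastMillis = epochMillis;
            subCounter = 0;
        }
        return epochMillis * 100 + subCounter;
    }
}
```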
Been storing microseconds since epoch in a number field for 2 years now. Suits our needs, but YMMV.
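Deriving that value with java.time is a one-liner (the mapping itself is just an ordinary numeric long field):

```java
import java.time.Instant;

public class MicrosSinceEpoch {
    public static void main(String[] args) {
        // Microseconds since epoch, suitable for a plain numeric (long) field.
        Instant ts = Instant.parse("2017-08-30T14:26:30.915748Z");
        long micros = ts.getEpochSecond() * 1_000_000L + ts.getNano() / 1_000L;
        System.out.println(micros); // 1504103190915748
    }
}
```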
cc @elastic/es-search-aggs
Not all time data is collected using commodity hardware. There is plenty of specialty equipment that collects nanosecond resolution data. Thinking about other applications besides log analysis. Sorting by time is critical, but aggregations over small timeframes is also important. For example, maybe I just want to aggregate some scientific data over a one second window or even over millisecond window.
I have nanosecond resolution data and would love to be able to use ES aggregations to analyze it.
Elasticsearch 7.0 will include a date_nanos field type that handles nanosecond sorting precision:
https://github.com/elastic/elasticsearch/pull/37755
Nanosecond precision is now a first-class citizen that doesn't require two fields to retain precision, so I will close this issue; please open new ones if you find bugs or enhancements to make on this new field type.
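For anyone landing here: a date_nanos field accepts ISO-8601 timestamps with up to nine fractional digits, so producing a suitable value from Java is straightforward (a minimal sketch; the mapping step itself is not shown):

```java
import java.time.Instant;

public class DateNanosValue {
    public static void main(String[] args) {
        // Instant.toString() keeps the full nanosecond resolution, which a
        // field mapped as "date_nanos" preserves on indexing.
        Instant ts = Instant.ofEpochSecond(1_504_103_190L, 915_748_000L);
        System.out.println(ts); // 2017-08-30T14:26:30.915748Z
    }
}
```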
https://jira.qos.ch/browse/LOGBACK-1374 added Instant getInstant() to the ILoggingEvent interface, allowing appenders to capture nanosecond-resolution timestamps!
It is in 1.3.0-alpha12. I expect to see usage in new appenders.
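A sketch of what such an appender might do with the new accessor (the appender itself and the shipping step are hypothetical):

```java
import java.time.Instant;

import ch.qos.logback.classic.spi.ILoggingEvent;
import ch.qos.logback.core.AppenderBase;

// Hypothetical appender that reads the full-resolution Instant
// (Logback >= 1.3.0-alpha12) instead of the millisecond getTimeStamp().
public class NanoTimestampAppender extends AppenderBase<ILoggingEvent> {
    @Override
    protected void append(ILoggingEvent event) {
        Instant ts = event.getInstant(); // nanosecond-capable timestamp
        String value = ts.toString();    // e.g. 2017-08-30T14:26:30.915748Z
        // ... ship `value` to a date_nanos field in Elasticsearch
    }
}
```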
At present, the 'date' type has millisecond precision. For many log use cases, higher-precision time is valuable: microsecond, nanosecond, etc.
The biggest impact of this is on sorting of search results. If you sort chronologically, newest-first, by a date field, documents with the same date will probably be sorted incorrectly relative to each other (because they tie). Users often report seeing events "out of order" when they have the same timestamp. A specific example: sorting by date shows events in newest-first order, unless there is a tie, in which case oldest-first (or first-written?) order appears. This causes a bit of confusion for the ELK use case.
Related: https://github.com/logstash-plugins/logstash-filter-date/pull/8
I don't have any firm proposals, but I have two different implementation ideas:
- Easy: store higher-precision time in two fields. One millisecond-precision date field keeps all the date features (range queries, date math like now-1h, or doing date_histogram, etc.), and a second numeric field holds the sub-millisecond remainder as a tie-breaker when sorting (see the sketch after this list).
- Harder, but better: make the date type have configurable precision, with the default (backwards compatible) precision being milliseconds. This would let us choose, for example, nanosecond precision for the logging use case, and year precision for an archaeological use case (billions of years ago, or something). Benefit here is date histogram and other date-related features could still work. Further, having the precision configurable would allow us to keep the underlying data structure a 64-bit long and users could choose their most appropriate precision.
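The two-field idea, sketched concretely (the field names @timestamp and @timestamp_fractional are illustrative only):

```java
import java.time.Instant;

public class TwoFieldSplit {
    public static void main(String[] args) {
        Instant ts = Instant.parse("2017-08-30T14:26:30.915748Z");
        long epochMillis = ts.toEpochMilli();          // -> "@timestamp" (date field)
        int subMillisNanos = ts.getNano() % 1_000_000; // -> "@timestamp_fractional" (long field)
        System.out.println(epochMillis + " + " + subMillisNanos + " ns");
        // Sorting by ["@timestamp", "@timestamp_fractional"] restores full order.
    }
}
```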