logstash-plugins / logstash-output-influxdb

Apache License 2.0

Enhancement Request - automatic time-bucket conflict resolution #57

Open PauloAugusto-Asos opened 7 years ago

PauloAugusto-Asos commented 7 years ago

Enhancement Request

Requesting that the plugin automatically resolve time-bucket conflicts.

If we send 2 or more data points to the same "series" (same measurement and tag set) with the same timestamp, InfluxDB will just overwrite them, keeping only the last one received. This is quite likely to happen on high-traffic websites, where the same server responds to more than one identical request within a second while the request/response time is stored with only one-second granularity.
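To illustrate the overwrite, here is a minimal Python sketch of a store that keys points by series and timestamp. It is a toy model, not InfluxDB itself, and the series name `response_time,host=web01` and the values are made up for the example.

```python
# Toy model (an assumption, not InfluxDB code): points are keyed by
# (series, timestamp), so a later point with the same key replaces the earlier one.
def write_points(points):
    store = {}
    for series, timestamp, value in points:
        store[(series, timestamp)] = value  # last write wins
    return store


# Three requests answered by the same server within the same second.
points = [
    ("response_time,host=web01", "12:34:56", 120),
    ("response_time,host=web01", "12:34:56", 95),
    ("response_time,host=web01", "12:34:56", 310),
]

stored = write_points(points)
print(len(points), "points sent,", len(stored), "point stored")
```

Only the last of the three values survives, which is the behaviour described above.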

Proposal:

output {
  influxdb {
    time_conflict_resolver => "AddMillisecond"

Changes the timestamps 12:34:56, 12:34:56, 12:34:56 to 12:34:56.001, 12:34:56.002, 12:34:56.003.

    time_conflict_resolver => "AddMicrosecond"

Same, but at the level of microseconds. Potentially also the same at the level of nanoseconds.

    time_conflict_resolver => "AddNewTag"
    time_conflict_resolving_tag => "qwerty"

Adds the following InfluxDB "tags" to each conflicting data point, and only to those: qwerty=1, qwerty=2, qwerty=3. This one creates new series, but it's my favorite, as it doesn't change the timestamp.
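A rough Python sketch of how the proposed resolvers could behave on a single batch of points. This is only an illustration of the idea, not plugin code: the function `resolve_conflicts`, the point layout (`measurement`, `tags`, `time_ns`, `fields`) and the `AddNanosecond` name are all hypothetical.

```python
import copy
from collections import defaultdict

# Offset per conflicting point, in nanoseconds. "AddNanosecond" is a
# hypothetical name; the proposal only mentions that level as a possibility.
OFFSET_NS = {
    "AddMillisecond": 1_000_000,
    "AddMicrosecond": 1_000,
    "AddNanosecond": 1,
}


def resolve_conflicts(points, resolver, tag_name="qwerty"):
    """Return a copy of `points` with same-series, same-timestamp conflicts resolved."""
    # Group point indexes by (measurement, tag set, timestamp).
    groups = defaultdict(list)
    for index, point in enumerate(points):
        key = (point["measurement"], tuple(sorted(point["tags"].items())), point["time_ns"])
        groups[key].append(index)

    resolved = [copy.deepcopy(p) for p in points]
    for indexes in groups.values():
        if len(indexes) < 2:
            continue  # no conflict, leave the point untouched
        for n, index in enumerate(indexes, start=1):
            if resolver == "AddNewTag":
                resolved[index]["tags"][tag_name] = str(n)  # qwerty=1, qwerty=2, ...
            else:
                resolved[index]["time_ns"] += n * OFFSET_NS[resolver]
    return resolved


base = 45_296 * 1_000_000_000  # 12:34:56 as nanoseconds since midnight, for illustration
batch = [
    {"measurement": "access", "tags": {"host": "web01"}, "time_ns": base, "fields": {"ms": v}}
    for v in (120, 95, 310)
]

shifted = resolve_conflicts(batch, "AddMillisecond")
tagged = resolve_conflicts(batch, "AddNewTag")
print([p["time_ns"] - base for p in shifted])
print([p["tags"] for p in tagged])
```

The sketch only sees conflicts inside one batch. Points with the same timestamp that arrive in different batches would need some state kept between flushes.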

PauloAugusto-Asos commented 7 years ago

And just to confirm, I can indeed see strong signs that web access logs are being missed. I'm consistently getting 4 entries per second for each server, while at least one server peaked at ~150 similar requests in the same second in its access logs. A huge number of log entries are getting lost due to the time-bucket conflict.
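One way to estimate how much is lost is to count the access-log lines that fall into the same series and the same second. A small Python sketch, assuming each log line has already been reduced to a `(series_key, second)` pair. The helper `count_overwritten` and the sample data are hypothetical and only echo the numbers above.

```python
from collections import Counter


def count_overwritten(records):
    """records: iterable of (series_key, second) pairs, one per access-log line."""
    per_bucket = Counter(records)
    sent = sum(per_bucket.values())  # lines sent to the output
    kept = len(per_bucket)           # one point survives per (series, second) bucket
    return sent, kept, sent - kept


# Illustrative data only: 150 similar requests plus 3 others in the same second.
records = (
    [("host=web01,status=200", "12:34:56")] * 150
    + [("host=web01,status=404", "12:34:56")] * 3
)

sent, kept, lost = count_overwritten(records)
print(f"sent={sent} kept={kept} lost={lost}")
```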

PauloAugusto-Asos commented 7 years ago

I was losing plenty of requests due to time-bucket conflicts. That may still be acceptable for many use cases, but it didn't suit mine, so I had to go back to outputting to Elasticsearch instead :(.

I really hope you guys are able to improve this so I can get back to outputting to InfluxDB. After changing the output from InfluxDB to Elasticsearch, it's now chewing through the server's disk space like crazy... :(