travelping / exometer_influxdb

Exometer reporter for InfluxDB
Mozilla Public License 2.0

Metric TimeStamp handling #14

Closed hwinkel closed 8 years ago

hwinkel commented 8 years ago

The timestamp for a given metric should be set at the source. As you can see with batching, unavailable backends, etc., the receive timestamp of the backend (in your case InfluxDB) will always differ from the source timestamp. Therefore both are important.

The same is true for the sample intervals. There is a difference between how often an application updates the exometer metrics (push from app -> exo) and how often the reporter sends the stats out to the backend.

If we have the intended HTTP GET "reporter" discussed with @uwiger and @GalaxyGorilla, the external GET (pull) interval will dictate the reporting interval. If you have a metric of type exometer_function, this will also trigger the collection via the callback, as far as I understand Exometer.

surik commented 8 years ago

About timestamps and batching: we can configure both of these options. We should set them very carefully, because if we turn off timestamping on the reporter side and set a huge batch window, the measurements stored in InfluxDB will not be very precise. I see the following rules for choosing good values for the timestamp option and the batch window size: 1) If you want precise data on the backend and want to use batching, use timestamps on the reporter side. Alternatively, set the batch window size to 0 and use timestamps on the backend side, but in this case each measurement will be sent as a separate message (not good for performance). 2) If the precision of the data is not important, you can use timestamps on the backend. In this case, the bigger the batch window size, the less precise the data on the backend will be.
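To make the trade-off concrete, here is a minimal sketch in Python. It is not code from exometer_influxdb; the function name and the numbers are made up for illustration. It assumes the backend stamps a whole batch on receipt, that batches are flushed at fixed multiples of the window, and that network latency is zero, so a measurement's timestamp error is simply how long it waited in the batch.

```python
def backend_timestamp_errors(report_times_ms, batch_window_ms):
    """Return the timestamp error (ms) of each measurement when the
    backend stamps a whole batch on receipt.

    Assumptions of this sketch: batches are flushed at fixed multiples of
    batch_window_ms, latency is zero, and a window of 0 means every
    measurement is sent immediately as its own message.
    """
    if batch_window_ms == 0:
        return [0 for _ in report_times_ms]
    errors = []
    for t in report_times_ms:
        # First flush boundary at or after the report time (ceiling division).
        flush_at = -(-t // batch_window_ms) * batch_window_ms
        errors.append(flush_at - t)
    return errors


reports = [100, 900, 1500, 4200]  # hypothetical report times in ms
print(backend_timestamp_errors(reports, 0))     # [0, 0, 0, 0]
print(backend_timestamp_errors(reports, 1000))  # [900, 100, 500, 800]
print(backend_timestamp_errors(reports, 5000))  # [4900, 4100, 3500, 800]
```

Under these assumptions the error is bounded by the batch window size, which is why a larger window means less precise backend timestamps.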

About the HTTP GET "reporter": I don't understand how it is related to the InfluxDB reporter. Do you want to have an HTTP API on the application side which serves measurements in response to GET requests? Can you explain more?

hwinkel commented 8 years ago

@surik

1) If you want precise data on the backend and want to use batching, use timestamps on the reporter side. Alternatively, set the batch window size to 0 and use timestamps on the backend side, but in this case each measurement will be sent as a separate message (not good for performance).

I don't know whether it is even on the reporter side, as I still have no clue about the difference between exo_reporter and exo_core, and @GalaxyGorilla could not clarify this either in the last discussions. As a rule of thumb, there are many timestamps while the metric values are traveling, and two of them are important:

  1. the Source Timestamp, taken at the SOURCE of the metric (most important)
  2. the Receive Timestamp, taken when a backend has received the metric (not relevant for the exporter, as the backend will add it anyway as required)

Do you see any other important timestamps?

hwinkel commented 8 years ago

About the HTTP GET "reporter": I don't understand how it is related to the InfluxDB reporter. Do you want to have an HTTP API on the application side which serves measurements in response to GET requests? Can you explain more?

Right, yes, this must be moved to the HTTP "reporter" project.

surik commented 8 years ago

the Source Timestamp, taken at the SOURCE of the metric (most important)

I think this is the timestamp which comes from exometer_core.

the Receive Timestamp, taken when a backend has received the metric (not relevant for the exporter, as the backend will add it anyway as required)

Please see the timestamp part of the InfluxDB documentation: https://docs.influxdata.com/influxdb/v0.10/write_protocols/line/

But by design we can't use both at the same time.

exometer_influxdb is just a mediator between exometer_core (the initiator of data reporting) and InfluxDB (the data storage). exometer_influxdb just translates data from exometer_core into a format that InfluxDB understands.
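As an illustration of why only one timestamp can be used per point, here is a rough Python sketch of the line protocol layout described in the linked documentation. It is not the reporter's actual code: the function name `to_line` and the sample values are hypothetical, and the sketch skips escaping of spaces and commas and writes every field value as a float.

```python
def to_line(measurement, fields, tags=None, timestamp_ns=None):
    """Format one point as an InfluxDB line protocol line.

    Simplifications in this sketch: no escaping of special characters,
    and all field values are written as floats.
    """
    line = measurement
    for key, value in sorted((tags or {}).items()):
        line += ",{}={}".format(key, value)
    line += " " + ",".join(
        "{}={}".format(key, float(value))
        for key, value in sorted(fields.items())
    )
    if timestamp_ns is not None:
        # The single optional timestamp slot: the point is stored at this time.
        line += " {}".format(timestamp_ns)
    # If the slot is left empty, InfluxDB assigns its own time on receipt.
    return line


print(to_line("cpu_load", {"value": 0.5}, {"host": "node1"}))
# cpu_load,host=node1 value=0.5
print(to_line("cpu_load", {"value": 0.5}, {"host": "node1"},
              timestamp_ns=1456000000000000000))
# cpu_load,host=node1 value=0.5 1456000000000000000
```

A line has exactly one timestamp position, so a point carries either a timestamp supplied by the sender or the one InfluxDB assigns, never both.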

hwinkel commented 8 years ago

I see your point, semantic overloading at best. The final timestamp in the metric should be as close to the source as possible, especially if you use nanosecond precision as documented. From the exo_core perspective this should be in there.

From the current exporter perspective, IMHO it's OK for now to map it to the %timestamp in the influx protocol, as long as we know what we are doing (README.md) and how we map it.

GalaxyGorilla commented 8 years ago

These are the timestamping possibilities the InfluxDB reporter has to offer:

  1. Don't timestamp anything. This means
    • Single mode: every single measurement gets its own timestamp by InfluxDB when received
    • Batch mode: the whole batch gets its own SINGLE timestamp by InfluxDB when received
  2. Timestamp everything. This means
    • Single mode: every single measurement gets its own timestamp by the reporter when REPORTED from exo_core.
    • Batch mode: the batch gets its own single timestamp by the reporter when the batch is actually SENT. However, this can be improved to give every single measurement a timestamp on REPORT from exo_core.
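The difference between stamping a batch when it is SENT and stamping each measurement on REPORT can be sketched with a toy model in Python. This is only an illustration, not the reporter's implementation: `BatchingReporter`, `demo`, and the injected `clock` are hypothetical names, and times are plain integers so the result is deterministic.

```python
class BatchingReporter:
    """Toy model of a reporter that buffers measurements before sending.

    `clock` is any zero-argument callable returning the current time.
    """

    def __init__(self, clock, stamp_on_report):
        self.clock = clock
        self.stamp_on_report = stamp_on_report
        self.buffer = []

    def report(self, name, value):
        # Called when the metrics core hands a measurement to the reporter.
        ts = self.clock() if self.stamp_on_report else None
        self.buffer.append((name, value, ts))

    def flush(self):
        # Called when the batch is sent; unstamped points share the send time.
        sent_at = self.clock()
        batch = [(name, value, ts if ts is not None else sent_at)
                 for name, value, ts in self.buffer]
        self.buffer = []
        return batch


def demo(stamp_on_report):
    now = [0]  # mutable fake clock
    reporter = BatchingReporter(lambda: now[0], stamp_on_report)
    now[0] = 10
    reporter.report("a", 1)
    now[0] = 20
    reporter.report("b", 2)
    now[0] = 30
    return reporter.flush()


print(demo(stamp_on_report=False))  # [('a', 1, 30), ('b', 2, 30)]
print(demo(stamp_on_report=True))   # [('a', 1, 10), ('b', 2, 20)]
```

With stamping on report, each point keeps the time at which it reached the reporter, regardless of how long the batch waits before being sent.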

Regarding the "quality" of timestamps: you can indeed overdo this by building timestamps within the code of your application and reporting these timestamps together with the measurements. Generating timestamps within the reporter is a compromise, and the developer has to ensure that the precision is representative by choosing the reporter options well and placing the metric updates at the right positions.

hwinkel commented 8 years ago

Why still by the reporter? That's not correct, as the time of measurement is not the reporting time. Any reason for that? An exometer limitation?

GalaxyGorilla commented 8 years ago

To be precise here: The timestamp is set when exo_core pushes a measurement to a reporter (when timestamping is enabled, in single mode). It's not the time when the reporter sends it to InfluxDB. That's why I wrote "reported from exo_core [to a reporter]".

Apart from that I think it is not possible without hacky workarounds to generate timestamps within the application itself and then push them together with the actual measurements to a reporter. See update/2 which is used to update a metric:

https://github.com/Feuerlabs/exometer_core/blob/master/src/exometer.erl#L195

hwinkel commented 8 years ago

Sounds nearly OK. Happy with exo handling the timestamps and not the app.

GalaxyGorilla commented 8 years ago

I close this for now since I see no further points of discussion.