cloudfoundry / loggregator-agent

Archived: Now bundled in https://github.com/cloudfoundry/loggregator-agent-release
Apache License 2.0

Origin and Delta is missing in `Metron Counter Event` #11

Closed johha closed 5 years ago

johha commented 5 years ago

Hi,

we are updating our landscapes from CF v5.5 to v6.5. During testing we discovered that some counter events have changed.

Using the CF nozzle plugin, we see that the origin and delta fields are not populated correctly anymore:

CF 5.5 (loggregator 104.0 / loggregator-agent 2.3)

origin:"loggregator.metron"
eventType:CounterEvent timestamp:1545055762713495887 
deployment:"cf" job:"diego-cell" index:"4aa468d9-3a5b-4192-8bd2-8dd0ad93c626" ip:"10.0.138.206" 
tags:<key:"metric_version" value:"2.0" > tags:<key:"source_id" value:"metron" >
counterEvent:<name:"egress" delta:2646 total:11207103 >

CF 6.5 (loggregator 104.1 / loggregator-agent 3.2)

origin:"" 
eventType:CounterEvent timestamp:1545054696439141513
deployment:"cf" job:"diego-cell" index:"f48acf1b-e325-41e2-8b2a-34738c655062" ip:"10.1.9.0" 
tags:<key:"source_id" value:"metron" > 
counterEvent:<name:"egress" delta:0 total:304995 >

Can you check whether this is a bug, or did we miss something in the release notes?

Thanks and best regards, Jochen & Johannes

cf-gitbot commented 5 years ago

We have created an issue in Pivotal Tracker to manage this:

https://www.pivotaltracker.com/story/show/162702016

The labels on this github issue will be updated when the story is started.

bradylove commented 5 years ago

@johha The missing origin is a bug in cf-deployment that was fixed with this PR.

As for the missing delta, we changed the way the Loggregator Agent emits metrics and will now only emit the totals.

bradylove commented 5 years ago

@johha I am going to close this issue for now. Please feel free to re-open if you have further concerns about the bug or the change to the metrics not sending the delta.

jtuchscherer commented 5 years ago

@johha Just to expand a bit on dropping the delta: Most downstream consumers (Prometheus, Influx, Datadog, OpenTSDB...) have built-in functions to calculate the rate/derivative of counters, and most metric libraries don't do well with the emission of deltas. Therefore, we are slowly moving to a system where we discourage the usage of deltas and prefer totals.

Do you have a specific setup or use-case that would require the deltas? If yes, I would be very interested in hearing about it.

jochenehret commented 5 years ago

Thanks for the quick clarification! We currently use the delta to sum up all CF.loggregator.metron..*.egress values so that we get an overall sent messages/sec value. This is used to detect unusually high logging loads (e.g. caused by chatty apps) in a CF installation. If the delta is no longer available, we'll have to calculate it using Riemann's derivative functions.
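
For readers facing the same migration, here is a minimal sketch of deriving a per-second rate from the `total` field alone. It is not Riemann or Loggregator code: the `CounterRate` class, its `observe` method, and the choice of keying by a source string (for example the envelope's `index`) are hypothetical names used for illustration. It also assumes that a total lower than the previous one means the agent restarted and its counter began again from zero.

```python
from typing import Dict, Optional, Tuple


class CounterRate:
    """Derives a per-second rate from successive counter totals (hypothetical helper)."""

    def __init__(self) -> None:
        # Last (timestamp_ns, total) seen for each source, e.g. keyed by envelope index.
        self._last: Dict[str, Tuple[int, int]] = {}

    def observe(self, source: str, timestamp_ns: int, total: int) -> Optional[float]:
        """Record a sample; return events/sec since the previous sample, or None."""
        prev = self._last.get(source)
        self._last[source] = (timestamp_ns, total)
        if prev is None:
            return None  # first sample for this source: nothing to compare against
        prev_ts, prev_total = prev
        elapsed_s = (timestamp_ns - prev_ts) / 1e9  # timestamps are in nanoseconds
        if elapsed_s <= 0:
            return None  # out-of-order or duplicate sample
        # Assumption: a smaller total means the counter was reset, so the new
        # total is itself the number of events counted since the reset.
        delta = total - prev_total if total >= prev_total else total
        return delta / elapsed_s
```

Summing the returned rates across all sources would give an overall messages/sec value comparable to the one previously built from the deltas.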