Open MatheusArleson opened 3 years ago
With AWS CloudWatch's custom metric pricing model, reporting zeros on infrequent metrics drives up the price for that metric. CloudWatch charges $ per metric per hour. For each hour that has no data for a metric, the price is zero. The current "publishing zeros" approach makes the cost 100% instead of 5% for a metric only used once per day, or 33% for a metric only used during 8 hours of peak.
I expect this specific feature would increase the overall cost of our metrics by 2x-20x, as we have many infrequently used dimensions (infrequent APIs, limited-time-use APIs, and error metrics). That is not insignificant.
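The percentages in this comment can be checked with a few lines of arithmetic. This is a minimal sketch that assumes, as the comment states, that a metric is billed only for hours containing at least one datapoint; the class and method names are hypothetical and nothing here queries real CloudWatch pricing.

```java
import java.util.Locale;

// Hypothetical helper: the share of the "always publishing" cost that is paid
// when only some hours of the day contain datapoints.
final class MetricCostShare {

    // Fraction of the full daily cost billed when `activeHoursPerDay`
    // hours out of 24 contain data (assumes per-hour billing).
    static double billedShare(int activeHoursPerDay) {
        if (activeHoursPerDay < 0 || activeHoursPerDay > 24) {
            throw new IllegalArgumentException("hours must be within 0..24");
        }
        return activeHoursPerDay / 24.0;
    }

    // Formats the share as a percentage with one decimal place.
    static String asPercent(int activeHoursPerDay) {
        return String.format(Locale.ROOT, "%.1f%%", 100 * billedShare(activeHoursPerDay));
    }
}
```

Under that assumption, `asPercent(1)` gives about 4% (the "5%" above, rounded), `asPercent(8)` gives about 33%, and `asPercent(24)` gives 100%, which is what publishing a zero in every idle hour costs.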
The solution of removing counters after n minutes of inactivity is very acceptable for me, though I would prefer a solution provided by Micrometer to avoid maintenance/bugs as mentioned above.
develop own code to workaround a lib behavior, this could very quickly lead to bugs
Was anyone able to find a workaround for this scenario? Another problem: each 0 datapoint that the application did not intend to send consumes license capacity unnecessarily and skews the application's statistics. 0 should not be sent on idle steps.
The issue
Micrometer publishes zero-valued metrics when values have not changed during a step interval.
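To make the behavior concrete, here is a toy model of a step-based counter. It is not Micrometer's implementation; the class, its methods, and the `skipZeros` switch are all hypothetical and exist only to contrast "publish a zero for an idle step" with "publish nothing for an idle step".

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a counter that is published once per step interval.
final class StepCounterModel {
    private final boolean skipZeros;              // hypothetical switch, not a Micrometer option
    private final List<Long> published = new ArrayList<>();
    private long current;                         // increments seen in the open step

    StepCounterModel(boolean skipZeros) {
        this.skipZeros = skipZeros;
    }

    void increment() {
        current++;
    }

    // Called once at the end of each step interval: publishes the step's
    // count (unless it is zero and skipZeros is set), then resets it.
    void closeStep() {
        if (current != 0 || !skipZeros) {
            published.add(current);
        }
        current = 0;
    }

    List<Long> published() {
        return published;
    }
}
```

With two increments followed by three closed steps, the model publishes 2, 0, 0 when `skipZeros` is false and only 2 when it is true. The second case is the behavior the commenters in this thread are asking for.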
The Rationale - In Favor
A zero value in this case means no additional samples were seen in this interval. Not sending anything at all could suggest that the application was not able to deliver a value at all to the monitoring system. So I don't believe sending a zero is a waste.
I'm interested in your feedback on this idea, but I've often seen this conversation go a certain way:
X: I don't want to ship 0 values for counts (sums, maxes, etc.)
Me: Why?
X: It takes up space for no reason
Me: How much space would have been consumed if exactly one event happened per interval? (Answer: the same amount of space that all of these zeroes plus any non-zero values take)
X: But this counter (timer, summary, etc.) is bursty.
Me: I worry first about a capacity plan that relies on a certain periodic shape in your traffic. Failing this, you could use MeterRegistry.remove when the counter goes un-utilized.
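The last suggestion, removing a meter once it goes unused, could be sketched roughly as follows. This is a library-independent outline under stated assumptions: `IdleMeterEvictor`, its string keys, and the `removeMeter` callback are hypothetical, and in a real application the callback would be the place to call `MeterRegistry.remove` for the matching meter.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Tracks when each meter last recorded a sample and evicts the idle ones.
final class IdleMeterEvictor {
    private final Map<String, Long> lastActiveMillis = new ConcurrentHashMap<>();
    private final long idleTimeoutMillis;
    private final Consumer<String> removeMeter; // hypothetical hook, e.g. wraps MeterRegistry.remove

    IdleMeterEvictor(Duration idleTimeout, Consumer<String> removeMeter) {
        this.idleTimeoutMillis = idleTimeout.toMillis();
        this.removeMeter = removeMeter;
    }

    // Call whenever the meter identified by `key` records a sample.
    void markActive(String key, long nowMillis) {
        lastActiveMillis.put(key, nowMillis);
    }

    // Removes every meter that has been idle for at least the timeout
    // and returns the keys that were evicted.
    List<String> evictIdle(long nowMillis) {
        List<String> evicted = new ArrayList<>();
        for (Map.Entry<String, Long> entry : lastActiveMillis.entrySet()) {
            boolean idle = nowMillis - entry.getValue() >= idleTimeoutMillis;
            // remove(key, value) succeeds only if the meter was not touched meanwhile
            if (idle && lastActiveMillis.remove(entry.getKey(), entry.getValue())) {
                removeMeter.accept(entry.getKey());
                evicted.add(entry.getKey());
            }
        }
        return evicted;
    }
}
```

A scheduled task could call `evictIdle(System.currentTimeMillis())` once per step. One likely caveat is that code holding a reference to an evicted meter would need to look it up again on the next event, which is the kind of maintenance burden the commenters in this thread want to avoid.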
The Rationale - Against It
The Fix - Micrometer Approach
IMHO: Until point 2 is implemented, developing our own code to work around a library's behavior is not a good approach; it could very quickly lead to bugs.
The Fix - Suggestions
Queries could then discard the datapoints holding the marker and use the rest of the data. Just don't use zero as the marker: it causes problems because zero is a meaningful value in the background math.
Question
Can we implement the marker strategy instead of submitting zero values?
Related Issues