Closed rsmogura closed 8 months ago
Sending each measurement does not seem aligned with the goal of reducing cost. As far as I understand[0], CloudWatch charges per API call to store metrics, and sending each individual measurement would drastically increase the API calls to CloudWatch.
Other CloudWatch users limit the metrics sent with MeterFilters or by not binding MeterBinders for metrics they are not interested in.
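For reference, a minimal sketch of that filtering approach with Micrometer's `MeterFilter` API (the filtered name prefixes here are made up; `MeterFilter.denyNameStartsWith` and `MeterFilter.deny` are real Micrometer methods):

```java
// Hypothetical config fragment: limit what reaches CloudWatch.
CloudWatchMeterRegistry registry = /* ... */;
registry.config()
        .meterFilter(MeterFilter.denyNameStartsWith("jvm"))                    // drop a whole binder's metrics
        .meterFilter(MeterFilter.deny(id -> id.getName().contains("tomcat"))); // drop by predicate
```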
It also adds significant overhead to send each measurement and the benefit isn't clear. Even if CloudWatch provides functions to aggregate, why is it beneficial to use those on individual measurements instead of summary data?
We generally want to avoid allowing inefficient configurations unless required by the metrics system. Let me know if I'm missing something, though.
[0] Based on the custom metrics example in the CloudWatch pricing documentation
Hey folks,
I also need this and have taken a stab at it recently, as I have a use-case where I need pXX stats across instances of a service, and the existing way of publishing histograms doesn't quite work for this (at least I can't think of a way to achieve it using CW). The only way that would work is to publish the histogram data in a single `putMetricData` request with high-resolution metrics enabled, based on this link.
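For context, a hedged sketch of what publishing raw measurements as a single high-resolution datum could look like with the AWS SDK v2 (metric name, namespace, and values are invented; this is an illustration, not the PR's code):

```java
// Sketch: one datum carrying raw value/count pairs at 1-second resolution.
CloudWatchClient cloudWatch = CloudWatchClient.create();
MetricDatum datum = MetricDatum.builder()
        .metricName("api.latency")
        .values(12.0, 15.0, 98.0)          // raw measurements in the window
        .counts(1.0, 1.0, 1.0)             // occurrence count per value
        .storageResolution(1)              // 1s = high-resolution metric
        .unit(StandardUnit.MILLISECONDS)
        .build();
cloudWatch.putMetricData(PutMetricDataRequest.builder()
        .namespace("my-service")
        .metricData(datum)
        .build());
```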
Some thoughts: with the histogram-based approach, `min`/`max`/`avg` will be wrong. Personally I'm leaning towards implementing both options, controlled with a config option and potentially with a suffix `.approx` in the histogram-based metric.
I've got a [working prototype in my fork](https://github.com/codemedian/micrometer/compare/master...codemedian:feature/cloudwatch-statistics-set?expand=1) but am not sure about the metric naming and how the config/defaults should behave.
Additionally - can't find the ticket now, but I remember reading about it. I see this working as a whole as follows:
1) create a low-res Micrometer registry for all metrics (publishes min/max/avg etc.) as statistics sets
2) create a 2nd high-res registry with a metrics filter, and use a `MeterFilter` to add a `DistributionStatisticConfig` to only the desired metrics, which will then be published using the above-mentioned way
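A rough sketch of that two-registry setup, assuming the registry from `micrometer-registry-cloudwatch2`; the filter predicate, percentiles, and config objects below are placeholders, not part of any proposal:

```java
// Low-res registry: takes everything, publishes aggregates.
CloudWatchMeterRegistry lowRes =
        new CloudWatchMeterRegistry(lowResConfig, Clock.SYSTEM, cloudWatchAsyncClient);

// High-res registry: keeps only selected meters and widens their
// distribution statistics via a MeterFilter.
CloudWatchMeterRegistry highRes =
        new CloudWatchMeterRegistry(highResConfig, Clock.SYSTEM, cloudWatchAsyncClient);
highRes.config()
        .meterFilter(MeterFilter.denyUnless(id -> id.getName().startsWith("api.latency")))
        .meterFilter(new MeterFilter() {
            @Override
            public DistributionStatisticConfig configure(Meter.Id id,
                                                         DistributionStatisticConfig config) {
                return DistributionStatisticConfig.builder()
                        .percentiles(0.5, 0.99)   // placeholder percentiles
                        .build()
                        .merge(config);
            }
        });
```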
@codemedian Would you mind sharing your use-case with us? We might be able to give alternative ideas.
Certainly.
I have a number of instances of one REST service running and am defining SLAs based on `p99` of API call latency. I don't particularly care about which instance is having issues.

One way of approaching this would be to publish the `p99` of each service as a client-side percentile histogram and use a `SEARCH` and `MATH` function to combine them into a single series. This, however, has the problem that you can't set CloudWatch alarms on search results.
So, in order to get min/max/avg aggregated across instances, I dropped the `instance` tag from the metrics, which requires the statistics set PR I raised to work properly for CW to understand how to aggregate the different metrics.

In the same fashion I now want to publish percentiles on all instances, yet without an instance label. The only way I can think of this working is to let CW do the percentile calculation, i.e. publishing histogram data directly: either every single `value`/`count` combination for the time window, or alternatively an aggregated bucket-based histogram (which is what I've attempted in my branch).
An added benefit of letting CloudWatch do the percentile calculation is that the pXX options in the UI will work as expected.
Hope that makes sense :)
Avg of avg from three instances is different from the average of values from all three instances. Micrometer can't handle this right now. On the other hand, each pseudo-statistic metric costs money. That's why we decided to store values in CloudWatch on our own.
> Avg of avg from three instances is different than average of values from all three instances. Micrometer can’t handle this right now.
For avg of avg I believe my PR is of interest to you. Statistics sets allow publishing client-side aggregates while still preserving the CloudWatch functionality you're referring to, and reduce the number of metrics down to one. A small caveat is that, as of now, this only works for `Timers` and `DistributionSummaries`.
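To illustrate why statistics sets help here: CloudWatch's `StatisticSet` carries sample count, sum, min, and max, which is enough to recover the true cross-instance average, whereas averaging per-instance averages is not. A minimal sketch (the `StatSet` class below is a made-up stand-in, not Micrometer or AWS SDK code):

```java
// Made-up helper mirroring CloudWatch's StatisticSet shape.
class StatSet {
    final double sampleCount, sum, min, max;

    StatSet(double sampleCount, double sum, double min, double max) {
        this.sampleCount = sampleCount;
        this.sum = sum;
        this.min = min;
        this.max = max;
    }

    // Merging two sets keeps the aggregate exact: counts and sums add,
    // min/max compose -- so the combined average is the true average.
    StatSet merge(StatSet other) {
        return new StatSet(sampleCount + other.sampleCount,
                sum + other.sum,
                Math.min(min, other.min),
                Math.max(max, other.max));
    }

    double avg() {
        return sum / sampleCount;
    }
}
```

With instance A at 10 samples summing to 100 (avg 10) and instance B at 1 sample of 100 (avg 100), the naive avg-of-avgs is 55, while the merged set yields the true average of 200/11.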
edit: took a quick look at CW pricing, and the cost of one metric for a month is roughly equivalent to one `putMetricData` call per minute (with the latter losing out narrowly). So, a timer using the current MM approach (4 metrics) would cost you ~$0.50 per month. Alternatively you could have one metric and call `putMetricData` 4 times a minute for roughly the same price, meaning that if you have fewer than 150 unique measurements per 15 seconds then your approach is cheaper (unless my maths is off :D)
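The shape of that back-of-envelope maths can be written down as a tiny helper; the per-unit prices are passed in at call time and are assumptions, so check the current CloudWatch pricing page rather than trusting any constants:

```java
// Illustrative cost model: N custom metrics plus M putMetricData calls
// per minute over a 30-day month. Prices are caller-supplied assumptions.
class CostModel {
    static double monthlyCost(int metrics, int putCallsPerMinute,
                              double pricePerMetric, double pricePer1000Puts) {
        double minutesPerMonth = 60 * 24 * 30; // 43,200
        return metrics * pricePerMetric
                + putCallsPerMinute * minutesPerMonth * pricePer1000Puts / 1000.0;
    }
}
```

For example, with an assumed $0.30 per metric-month and $0.01 per 1,000 requests, one metric costs $0.30 while one call per minute costs 43,200 × $0.00001 = $0.432 -- the same order of magnitude, with the per-call side losing out, matching the rough comparison above.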
What about percentiles?
That's what I'm working on (see my first comment and the branch linked), but I'm not yet sure how to proceed. If you want to be CloudWatch-idiomatic, it boils down to either publishing all data points or a histogram approximation, and marking the metric as high-resolution. I'm currently leaning towards supporting both with a config switch on the registry, but any input/feedback is appreciated.
I think it’s impossible to provide percentiles with statistics sets according to the CloudWatch documentation. Generally, you push all data points to CloudWatch or you lose the functionality you pay for.
I don't understand why you're referring to statistics sets here; I have made no such claim. That being said, technically pXX works with statistics sets iff certain conditions hold true, see documentation.
I do agree with you that every data point needs to be published for percentiles to work properly, however I think there is value in - maybe additionally - having client-side histogram buckets and using those as an approximation of data points?
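As a sketch of that bucket-approximation idea (the method name and bucket layout are hypothetical, not the branch's actual code): given client-side bucket upper bounds and per-bucket counts, a pXX can be estimated as the upper bound of the bucket that contains the target rank, which is effectively what CW would compute if the buckets were published as value/count pairs:

```java
// Hypothetical sketch: estimate a percentile from histogram buckets.
// Returns the upper bound of the bucket containing the requested rank,
// i.e. a conservative (upward-biased) approximation of the percentile.
class Percentiles {
    static double approxPercentile(double[] bucketUpperBounds, long[] counts, double p) {
        long total = 0;
        for (long c : counts) total += c;
        long rank = (long) Math.ceil(p * total);           // 1-based rank of target sample
        long seen = 0;
        for (int i = 0; i < counts.length; i++) {
            seen += counts[i];
            if (seen >= rank) return bucketUpperBounds[i]; // first bucket covering the rank
        }
        return bucketUpperBounds[bucketUpperBounds.length - 1];
    }
}
```

The trade-off is exactly the one discussed above: accuracy is bounded by the bucket widths, but the payload is a handful of value/count pairs instead of every raw measurement.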
Sir, I think you mentioned statistics sets.
My point is different: why pay for CloudWatch and then self-limit to only 10% of the functionality?
😅 Misunderstanding then.
Re statistics sets, I was referring to min/max/avg for client-side aggregates, which gets you 100% of the functionality for normal, low-res metrics (which do not support percentiles in CW) from what I can tell.
For high-resolution metrics and percentiles I 100% agree with you. It's also my preferred approach and I'll see if I can fit it into the current registry/behaviour in any way.
If you would like us to look at this issue, please provide the requested information. If the information is not provided within the next 7 days this issue will be closed.
Closing due to lack of requested feedback. If you would like us to look at this issue, please provide the requested information and we will re-open.
Hi Micrometer Team,
I worked with Micrometer and CloudWatch and would like to have functionality to send all data points (measures) to CloudWatch, instead of aggregated statistics.
There are a few points why it would be more beneficial for me:
I wonder if such an approach is aligned with the Micrometer approach.
Generally, I was thinking about the following changes:
`MeterRegistry` implementations wanting to store data points instead of aggregated measures could update themselves to do this.