blacklocus / metrics-cloudwatch

A reporter for codahale metrics to Amazon CloudWatch
Apache License 2.0
61 stars 27 forks source link

Dropwizard Percentiles -> CloudWatch #11

Closed taltmann closed 8 years ago

taltmann commented 9 years ago

I added the ability to pass in percentiles (or use hopefully sane defaults) that a caller wants to track for Sampling metrics. Currently, it creates a new CloudWatch MetricDatum (StatisticSet) for each of the DemuxedKey names, extending the "type" with .p{percentile} -- e.g. com.example.getUserById.p995 for the 99.5th percentile.

The percentile configuration resulted in a bit of a constructor explosion, but I didn't want to break existing constructors. Perhaps using Guava Optionals for the optional parameters would be a better approach, but doing that across the board would result in some non-backward compatible changes, and I wasn't sure how open this project is to that.

I'd love to get feedback and thoughts on improvements/modifications.

dirkraft commented 8 years ago

Hi, I'm late because of job. Sorry. :sleeping:

I went through the thought experiment of translating such computed or derived metrics into cloudwatch for convenience, but decided against it in the initial implementation since it could easily lead to convoluted explanations of aggregations over such metrics. It could be useful, but I was ready to open that corner of Pandora's box; the discussions themselves are non-trivial.

Example: I have two machines contributing percentiles to the same metric. One machine, for whatever reason, happens to process 2x more work items than the other. What do I do with those metrics? How do I aggregate over them? It doesn't make a lot of sense. What would make more sense would be to compute the percentiles from the raw data points (or statistic sets) that were logged to CloudWatch by means of a tool or UI which made those computations on the fly.

Now, given that, I am still grateful for your open question. It is a vote for certain features, and it will help me figure out the future feature list for a possible 2.0 attempt. I will be sure to include special consideration for plugging in arbitrary additional submissions based on whatever data and state that is available in the codahale metrics objects, so that say if you do indeed have a single process contributing to some percentile metric, and you don't really care about aggregating over multiple submissions later, then it should be easier to plug that in.

taltmann commented 8 years ago

Thanks for considering.

I've learned some things since I originally submitted that reinforce your position, in particular around aggregation scenarios similar to the one that you describe. So, while I would like to have access to percentiles in CloudWatch, you're right that this change is not the way to accurately get them there.