pyr / cyanite

cyanite stores your metrics
http://cyanite.io

ability to use sum, min, max, and last #244

Closed ehlerst closed 7 years ago

ehlerst commented 7 years ago

Graphite allows choosing the aggregation method per pattern (in storage-aggregation.conf):

[counts]
pattern = \.count$
xFilesFactor = 0.1
aggregationMethod = sum

Perhaps something like this

engine:
  rules:
    "*\\.count": [ "5s:1h", "30s:1d" ]
      aggregation: "sum"

ifesdjeen commented 7 years ago

We have sum, avg, min, max. We aggregate them for all metrics. I'll implement last and come up with a reasonable syntax for aggregation in the config file.

ehlerst commented 7 years ago

Oh, I see now. All metrics get aggregates. Hmm, how does sum get selected when you make a query with graphite-api?

ifesdjeen commented 7 years ago

There's no way right now. Which is a bit stupid, but tbh we never got to do it, sorry.

tehlers320 commented 7 years ago

What's in the table, for anybody else who reads this later:

{path: 'benchmark.chunk_93.metric_3080.count', resolution: {precision: 30, period: 7776000}} | 1474471290 | {max: 10, mean: 10, min: 10, sum: 10}
{path: 'benchmark.chunk_29.metric_15056.count', resolution: {precision: 30, period: 7776000}} | 1474470510 | {max: 10, mean: 10, min: 10, sum: 10}

I wonder if we could just add a &point=sum parameter to the API to select one of these. The finder would then need updating as well to make use of it.

On the other hand, following the precedent Graphite has set, a regex such as \.count$ chooses which aggregation is used. I'm not sure how that would work here. Feed the config a big list of regexes to select different aggregations?
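For anybody trying to picture the regex approach, here is a minimal sketch of Graphite-style rule matching, where rules are checked in order and the first match wins. The rule list, the default of "mean", and the function name pick_aggregate are assumptions made for illustration; none of this is existing cyanite code or config.

```python
import re

# Hypothetical rule list: (pattern, aggregate) pairs, checked in order.
# The first matching pattern wins, as in Graphite's storage-aggregation.conf.
RULES = [
    (re.compile(r"\.count$"), "sum"),
    (re.compile(r"\.min$"), "min"),
    (re.compile(r"\.max$"), "max"),
]

# Assumed fallback when no rule matches.
DEFAULT_AGGREGATE = "mean"


def pick_aggregate(path, rules=RULES, default=DEFAULT_AGGREGATE):
    """Return the aggregate name selected for a metric path."""
    for pattern, aggregate in rules:
        if pattern.search(path):
            return aggregate
    return default


print(pick_aggregate("benchmark.chunk_93.metric_3080.count"))
```

The returned name would then pick one of the fields (max, mean, min, sum) already stored for each point.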

ifesdjeen commented 7 years ago

You mean benchmark.chunk_93.metric_3080.count.mean$ vs benchmark.chunk_93.metric_3080.count.max$?..

tehlers320 commented 7 years ago

Hmm, no, most users aren't going to be able to change their metric names.

I mean the https://github.com/pyr/cyanite/blob/master/src/io/cyanite/api.clj#L131 does some regex based on a config to choose its point. Sorry if that's not the right place in the code, I'm Opsdev not Devops.

ifesdjeen commented 7 years ago

I was thinking of doing it API-only. So we'll expose 4 metrics on the bottom-most level (and more when there's more). So you'll be storing metrics the way you usually do and querying them the same way, but appending .max would change the output aggregate.

Alternatively, we could add functions such as _avg(my.metric) and _max(my.metric) (this will be much harder, though).

tehlers320 commented 7 years ago

Ahh, so this would just show up as an extension of the graphite tree? That sounds perfect other than the odd scenario of

benchmark.chunk_93.metric_3080.max -> benchmark.chunk_93.metric_3080.max.max
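A rough sketch of how that suffix lookup could behave, including the .max.max case. It assumes the ".max"-style suffix mentioned above, a default of "mean", and that a real stored metric takes priority over a virtual one with the same name; the function name resolve and the set of stored paths are hypothetical.

```python
# Aggregates assumed to be stored per point (as in the table rows above).
AGGREGATES = ("min", "max", "mean", "sum")


def resolve(query_path, stored_paths, default="mean"):
    """Map a query path to (stored_path, aggregate), or None if unknown."""
    # A real metric wins, even if its last segment looks like an aggregate.
    if query_path in stored_paths:
        return query_path, default
    # Otherwise treat the last segment as a virtual aggregate suffix.
    head, _, last = query_path.rpartition(".")
    if last in AGGREGATES and head in stored_paths:
        return head, last
    return None


stored = {
    "benchmark.chunk_93.metric_3080.count",
    "benchmark.chunk_93.metric_3080.max",
}
print(resolve("benchmark.chunk_93.metric_3080.count.sum", stored))
print(resolve("benchmark.chunk_93.metric_3080.max.max", stored))
```

With this ordering, a metric that is really named ...max still resolves to itself, and its own maximum is reachable as ...max.max.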

ifesdjeen commented 7 years ago

After giving it another thought, I came to the conclusion that it'll be simpler (and less confusing for Grafana) for any metric to have 5 metrics, in fact:

They'll be showing up along with the rest of metrics in autocomplete. I'm already working on it, but don't yet have any ETA.

tehlers320 commented 7 years ago

Is this going to increase the table size x5, or are you doing some code.foo appending the _$ onto existing table entries? 10 hours of 4.25 million metrics per minute with replica 2 comes to ~14 GB (confirmed in a live install). That's not so bad until you get into the longer storage ranges, even with aggregates. With a 15m:365d aggregate, the same volume should be around 266 GB for the year.

That's about 1.33 TB if it multiplies by 5.

Thank you for all your amazing work btw.

ifesdjeen commented 7 years ago

They'll just be "virtual" metrics. They won't exist physically (nothing will change for you, since we already store them); we'll just map to them internally. Right now I'll just add support to allow you to access them. We're still working on compression stuff; unfortunately, it's much more involved than implementing this feature.
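To illustrate why this costs no extra storage, here is a small sketch: the virtual names are computed when listing, and a read just picks one field out of the row that is already stored. The ".field" naming and both function names are assumptions for illustration, not the prototype's actual code; the row is the one quoted earlier in this thread.

```python
# Fields already present in every stored point.
AGGREGATE_FIELDS = ("max", "mean", "min", "sum")


def virtual_children(path):
    """Names to offer in autocomplete; computed on the fly, never stored."""
    return [f"{path}.{field}" for field in AGGREGATE_FIELDS]


def read_value(point, field):
    """Pick one aggregate out of an existing stored point."""
    return point[field]


# Point taken from the table row quoted earlier in the thread.
point = {"max": 10, "mean": 10, "min": 10, "sum": 10}

print(virtual_children("benchmark.chunk_93.metric_3080.count"))
print(read_value(point, "sum"))
```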

ifesdjeen commented 7 years ago

Prototype ready in #248