grafana / carbon-relay-ng

Fast carbon relay+aggregator with admin interfaces for making changes online - production ready

insights into which kinds of metrics are inputted #157

Open Dieterbe opened 7 years ago

Dieterbe commented 7 years ago

we need some kind of module in which you can define regex patterns and count how many series matching each pattern are seen, using capture groups (matching subpatterns) to extract the parts you need, like so:

customer defines /^(benchmarks).[^.]*.([^.]*)/
customer sends metric benchmarks.foo.bar.xyz
we match and create stats under stats.benchmarks.bar

perhaps this can be simplified further by using plain graphite syntax instead of regex.
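
To make the idea concrete, here is a minimal sketch in Python of the matching step described above. It is not how carbon-relay-ng does or would implement this; the function names `stats_key` and `count_series` and the `stats` prefix argument are hypothetical, and the pattern is written with escaped dots (`\.`) so that a dot only matches a literal separator.

```python
import re
from collections import defaultdict

# Hypothetical rule from the example above, with the dots escaped so they
# only match literal "." separators between metric name nodes.
PATTERN = re.compile(r"^(benchmarks)\.[^.]*\.([^.]*)")


def stats_key(pattern, metric, prefix="stats"):
    """Return the stats key for a metric name, or None if it does not match.

    The key is the prefix followed by every capture group, joined with dots.
    """
    m = pattern.match(metric)
    if m is None:
        return None
    return ".".join([prefix, *m.groups()])


def count_series(pattern, metrics):
    """Count distinct series names seen per stats key."""
    seen = defaultdict(set)
    for metric in metrics:
        key = stats_key(pattern, metric)
        if key is not None:
            seen[key].add(metric)
    return {key: len(names) for key, names in seen.items()}
```

With this sketch, `stats_key(PATTERN, "benchmarks.foo.bar.xyz")` yields `stats.benchmarks.bar`, as in the example. Tracking a set of names per key counts distinct series rather than datapoints, at the cost of memory that grows with the number of series.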

ehlerst commented 7 years ago

Would a feature like this cause a big hit on CPU? This would be a very nice thing to have. I do something similar but with a find/wc bash script on legacy graphite. It is very slow to report for obvious reasons.

Dieterbe commented 7 years ago

a noticeable hit, yes. with regex it can become significant depending on how much other work the relay is doing (e.g. aggregations)

randallt commented 6 years ago

Something like this would be very helpful for finding metrics that are being sent too often. But I would propose that there should be a way to get counts for all metrics, if possible. I recently had to use 'tcpdump' (not my favorite) to find out which metrics were being sent and how frequently. I was seeing spikes of over 5 million per minute. I found that other teams were incorrectly buffering metrics to a file and sending the file to carbon at regular intervals, but never clearing the file. This was bringing our carbon/graphite system to its knees, with huge numbers of dropped metrics. I would give up plenty of CPU to be able to quickly see rough metric frequencies instead of having to drop to something like tcpdump.
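
As a rough illustration of the "counts for all metrics" idea, the sketch below tallies how often each metric name appears in a batch of carbon plaintext lines (`<name> <value> <timestamp>`). It is an assumption-laden example, not an existing carbon-relay-ng feature: the helper names `metric_frequencies` and `top_senders` are hypothetical, and how the lines are captured (for instance, one batch per minute) is left to the caller.

```python
from collections import Counter


def metric_frequencies(lines):
    """Count how many datapoints were received per metric name.

    Each line is expected in carbon plaintext form: "<name> <value> <timestamp>".
    Lines that do not have exactly three fields are skipped.
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # malformed line, ignore it
        counts[fields[0]] += 1
    return counts


def top_senders(counts, n=10):
    """Return the n most frequently seen metric names with their counts."""
    return counts.most_common(n)
```

A metric that is resent from a never-cleared buffer file would show up near the top of `top_senders` with a count far above what its reporting interval implies.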