statsite / statsite

C implementation of statsd
http://statsite.github.io/statsite/

Any plans to support integration with Prometheus #224

Open dannyk81 opened 7 years ago

dannyk81 commented 7 years ago

With StatsD we can use the "repeater" option to pipe the metrics to Prometheus StatsD Exporter, but I don't see a way to achieve this with statsite.

Any ideas?

johnkeates commented 7 years ago

I'm not sure what repeater in statsd does, but if it simply means 'to send to another endpoint', you could add an additional python sink for that if you need it.

dannyk81 commented 7 years ago

Thanks for the response :)

Here's what StatsD repeater does:

Repeater (repeater): Utilizes the packet emit API to forward raw packets retrieved by StatsD to multiple backend StatsD instances.

Basically, it sends the incoming packets in raw format (as-is) to another StatsD instance. In this case that instance is the Prometheus StatsD Exporter, which translates the metrics to Prometheus format and exposes them to be scraped by Prometheus:

+----------+                     +-------------------+                        +--------------+
|  StatsD  |---(UDP repeater)--->|  statsd_exporter  |<---(scrape /metrics)---|  Prometheus  |
+----------+                     +-------------------+                        +--------------+

johnkeates commented 7 years ago

Yes, that can be done with statsite as well. The stream_cmd basically takes any command, including combining a sink with the tee command, or chaining sinks, or you could create a sink that simply relays. The binary sink might be a good start: https://github.com/statsite/statsite/blob/master/sinks/binary_sink.py

dannyk81 commented 7 years ago

Thanks for the guidance! We'll need to figure out how to write this sink.

drawks commented 7 years ago

FWIW I don't think it is possible to write a stream cmd that would work as a repeater sink, since statsite always aggregates what it receives and always flushes the results of aggregation. The repeater feature of etsy statsd just forwards all statsd instrumentation events unaggregated to another statsd instance via tcp.

It would be VERY useful to have the same repeater functionality in statsite.

drawks commented 7 years ago

Was just considering this some more, I /think/ it wouldn't be too hard to implement this feature by having an option similar to stream_cmd which forks a relay sink that would just get all the statsd updates written to stdin. This would allow for arbitrarily acting on non-aggregated statsd events; so that a repeater sink could be as simple as nc other-statsd.foo.com 8125.
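
To make the proposal above concrete, here is a minimal Python sketch of what such a relay command could look like. It assumes a hypothetical statsite option (not an existing one) that writes each raw, unaggregated statsd line to the command's stdin; the script name, host, and port are placeholders.

```python
import socket
import sys


def relay_lines(lines, sock, dest):
    """Send each non-empty raw statsd line to dest as its own UDP datagram."""
    sent = 0
    for line in lines:
        payload = line.strip()
        if not payload:
            continue  # skip blank lines rather than sending empty datagrams
        sock.sendto(payload.encode("utf-8"), dest)
        sent += 1
    return sent


def main(argv):
    # Hypothetical usage: relay.py HOST PORT
    host, port = argv[1], int(argv[2])
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        relay_lines(sys.stdin, sock, (host, port))


if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv)
```

This does roughly what `nc -u other-statsd.foo.com 8125` would do, except that each line goes out as a separate datagram.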

leoluk commented 6 years ago

You can use statsite with graphite_exporter:

https://github.com/prometheus/graphite_exporter
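
As a rough illustration of this route, the sketch below is a stream_cmd-style script that reads statsite's flushed output (one `key|value|timestamp` line per metric) from stdin and forwards it in Graphite plaintext form over TCP. It is not the sink shipped with statsite; the `statsite.` prefix, the script name, and the target host and port are assumptions (9109 is assumed here to be graphite_exporter's Graphite listen port, so check your deployment).

```python
import socket
import sys


def to_graphite(line, prefix="statsite."):
    """Convert one 'key|value|timestamp' line to Graphite plaintext.

    Returns None for lines that do not have exactly three fields.
    """
    parts = line.strip().split("|")
    if len(parts) != 3:
        return None
    key, value, timestamp = parts
    return f"{prefix}{key} {value} {timestamp}\n"


def main(argv):
    # Hypothetical usage: to_graphite.py HOST PORT (e.g. 127.0.0.1 9109)
    host, port = argv[1], int(argv[2])
    with socket.create_connection((host, port)) as conn:
        for line in sys.stdin:
            converted = to_graphite(line)
            if converted is not None:
                conn.sendall(converted.encode("utf-8"))


if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv)
```

Note that everything sent this way has already been aggregated by statsite over the flush interval.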

drawks commented 6 years ago

@leoluk yes, there are plenty of ways to plumb up the aggregated metrics from statsite, there is still no way to forward the raw statsd event stream, which is what is wanted here.

johnkeates commented 6 years ago

You can forward the raw event stream using /dev/tcp or /dev/udp or netcat or perhaps socat. But I believe Prometheus uses polling by default, so I'm not sure if forwarding is what you need.

drawks commented 6 years ago

Yes you can tee upstream of statsite, I think the initial ask boils down to "the reference statsd server implementation includes the ability to forward raw unaggregated statsd events to other destinations. It sure would be nice if statsite also had this same feature."

If the answer from statsite project is "we have no interest in implementing that." it seems reasonable to say such, but would also be nice if it was clearly documented that this was asked and answered AND is a feature from the reference implementation which is not currently available.

That said I think there is some definite value in having feature parity with the reference implementation.

johnkeates commented 6 years ago

I'm not sure statsd should be considered a reference implementation for statsite. AFAIK the author wrote statsite to be wire-compatible with the statsd protocol.

Also, as far as I know, you can pretty much tell statsite about any destination you like. That's why stream_cmd exists: you can stream all the stuff it has anywhere.

I suppose someone could make an alias called stream_forward_to_somewhere_else with the exact same parameters, but it would do the same.

This issue began with:

With StatsD we can use the "repeater" option to pipe the metrics to Prometheus StatsD Exporter, but I don't see a way to achieve this with statsite.

The way to achieve this is: stream_cmd=nc 1.2.3.4 5678 or something like that, and if you want more than one stream, there is (as you suggested) tee.

If you want to have nc bolted on to statsite, that could be done, but I'm not sure how that would help, it does exactly the same thing...

drawks commented 6 years ago

@johnkeates stream_cmd does not forward raw unaggregated statsd events, this has been said and corrected multiple times in this thread. The stdout stream provided to stream_cmd is aggregated values which have been calculated over the previous interval before each flush.
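
The difference can be shown with a deliberately simplified Python model of counter aggregation. This is not statsite's actual aggregation code (it ignores sample rates, timers, and statsite's output key naming); it only illustrates that several raw events collapse into a single value per flush, so the individual events cannot be recovered downstream.

```python
from collections import defaultdict


def aggregate_counters(raw_events):
    """Sum counter events ('name:value|c') seen during one flush interval.

    Simplified model: non-counter types are ignored, sample rates are not applied.
    """
    totals = defaultdict(float)
    for event in raw_events:
        name, rest = event.split(":", 1)
        value, metric_type = rest.split("|")[:2]
        if metric_type == "c":
            totals[name] += float(value)
    return dict(totals)


# Three raw counter events in one interval become a single aggregated value.
raw = ["hits:1|c", "hits:1|c", "hits:3|c"]
print(aggregate_counters(raw))
```

A repeater would forward the three `hits` events as received, while a stream_cmd sink only ever sees the one summed value.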

johnkeates commented 6 years ago

@drawks ah, then it makes sense. I suppose nc+tee in front of statsite would be a solution to that, but the UDP repeater makes more sense now.
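
A sketch of the "in front of statsite" approach, written in Python rather than with nc and tee: a small UDP proxy that receives raw statsd datagrams and sends each one unchanged to several destinations. All addresses are assumptions for illustration (statsite moved to port 8126, statsd_exporter on 9125), and the `max_packets` parameter exists only to make the loop testable.

```python
import socket


def fan_out(listen_sock, out_sock, destinations, max_packets=None):
    """Forward each received datagram unchanged to every destination.

    Runs forever unless max_packets is given.
    """
    handled = 0
    while max_packets is None or handled < max_packets:
        data, _addr = listen_sock.recvfrom(65535)
        for dest in destinations:
            out_sock.sendto(data, dest)
        handled += 1
    return handled


def main():
    # Assumed layout: clients keep sending to 8125, statsite listens on 8126,
    # and statsd_exporter listens on 9125. Adjust to your environment.
    listen = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    listen.bind(("0.0.0.0", 8125))
    out = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    fan_out(listen, out, [("127.0.0.1", 8126), ("127.0.0.1", 9125)])
```

Calling `main()` starts the proxy. Since the packets are copied before statsite sees them, statsite still aggregates as usual while the exporter receives the raw events.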

I'm not a C/C++ rockstar, but if someone has a piece of code that can take an existing socket in C and repeat everything to a new socket pointing elsewhere, I'm happy to integrate it. A patch/MR that completely does this would be even more welcome.

leoluk commented 6 years ago

I suggest renaming the issue and creating a separate one for documenting best practices for Prometheus integration.

johnkeates commented 6 years ago

Maybe we need both implementations, one before aggregation and one afterwards.

drawks commented 6 years ago

heh, that sounds an awful lot like what I suggested back in May ;)

leoluk commented 6 years ago

Probably so, different trade-offs for both (aggregation, counter resets, expiry/staleness handling...)

johnkeates commented 6 years ago

Yeah, I've been re-reading some of the stuff here (was first reading it on a tiny screen, missed a lot). Also, indeed, different trade-offs, but you at least get to pick, which is good for diverse environments.