jsocol / pystatsd

A Python client for statsd
http://statsd.readthedocs.io/
MIT License
542 stars 177 forks source link

ability to send stated tags #83

Closed kidanekal closed 8 years ago

kidanekal commented 8 years ago

Looking at ways to send tags for a given metrics so that I can slice and dice data more easily.

It will be great if pystatsd support passing tags as follows.

statsd_client.incr('upload.request_count', tags=['env:local', 'db:remote'])

jsocol commented 8 years ago

Does statsd itself support tags? As far as I know, those are a datadog-specific extension to the statsd protocol.

I believe if you're using datadog you can use a regular statsd client, but if you want their extensions to it, you need to use the datadog client.

kidanekal commented 8 years ago

@jsocol right the official statsd server doesn't support tags but there are community tools that let us post tags into statsd server.

Feel free to close this issue, since it is more of an idea/ nice to have for a better grouping of data.

jsocol commented 8 years ago

This client is really targeting only the de facto spec provided by statsd itself. I'm only aware of datadog's extension to support tags, and I'm not sure there's a single standard for how to send them to any other server (let alone a way to support them using graphite/carbon/whisper).

Happy to reconsider if tags ever make it into StatsD, but for now, yeah, I'm going to close this out.

toddjames commented 7 years ago

@jsocol statsd tagging using the DataDog format has found its way into the Telegraf statsd server. Telegraf is the de facto method to get data into InfluxDB (which can then be displayed with Grafana, Graphite, etc.). In fact, to use some more advanced features of Grafana like templating, tagging capabilities at the input source (statsd in this case) is a requirement.

Would you please reconsider adding this extended feature? Though it's not in the official implementation, it's becoming more useful outside the proprietary DataDog use cases now that Telegraf supports it.

FWIW, the implementation is simple, and I've already made the necessary changes on my local copy of pystatsd to get this functionality. I'd be happy to submit a PR, but testing is something I have no experience with.

For reference, tags look like this in the protocol (from the Telegraf source code):

        // users.online:1|c|@0.5|#country:china,environment:production
        // users.online:1|c|#sometagwithnovalue

My implementation added a _build_tag() method in StatsClientBase:

    def _build_tag(self, tag, value):
        if value:
            return '%s:%s' % (str(tag), str(value))
        else:
            return tag

I added a tags=None argument to StatsClientBase._prepare() and added this before the return statement:

        if tags:
            tag_string = ','.join(self._build_tag(k, v) for k, v in tags.items())
            return '%s:%s|#%s' % (stat, value, tag_string)

Then simply add a tags=None argument to the other methods and pass tags on as required.

edit: I should note that this implementation then accepts a dictionary of values to be passed as tags. For example:

client.gauge('cpu.percent', 40, tags={'fqdn': 'myserver.example.com', 'production': None})

which becomes |#fqdn:myserver.example.com,production

jsocol commented 7 years ago

I'm still not really thrilled about adding these, for a few reasons:

I think the appropriate way to handle this would be a forked version, statsd-tags or telegraf-client (maybe that already exists?) or something. That lets you opt into the behavior because you know your stack supports it, without needing to double the API here.

toddjames commented 7 years ago

Thanks for your input. I ended up forking the project and adding tag support there.

Poogles commented 7 years ago

@jsocol Unsure if your opinions on this have changed, but I gave this a quick run through here which implements a limited subset of the datadog/telegraph extension.

Before looking into the behaviour of the etsy and bitly daemons to these extras, does the extra tags keyword argument in the api seem OK and would you be willing to accept a PR around this?

If not no worries, I can see why you wouldn't.

jsocol commented 7 years ago

@Poogles My opinion hasn't changed, and everything I've said before remains true. In particular, at best, it's a parameter that does nothing, and at worst, it prevents metrics from being recorded.

Having spent a lot of time working with DataDog over the past couple years, I'm even more convinced. If you have tags, it fundamentally changes the way you name metrics and think about aggregation. With tags, you aggregate by default and disaggregate via tags; but without them, you have to reverse that. That is not something that should be adopted lightly, and I think is even more reason to make the opt in to that behavior extremely explicit. Requiring a new client to add/remove tags makes you touch every file that uses it, which is still below the minimum level of thought to put into the switch.