grafana / carbon-relay-ng

Fast carbon relay+aggregator with admin interfaces for making changes online - production ready

Making metric name validation optional #82

rtkrruvinskiy closed this issue 8 years ago

rtkrruvinskiy commented 9 years ago

We need to relay a large number of metrics, and some of them fail validation and are tagged as "bad metrics" because of multiple consecutive dots or the presence of a colon (:) in the metric name. The metric names are not entirely under our control, and we can't easily make them comply.

Would you be amenable to putting in a configuration option to turn off validation, at least for "legacy" metrics?

Dieterbe commented 9 years ago

yeah. when i built this feature i knew it wouldn't be appropriate for some people... even better perhaps, how about different levels? like loose vs strict vs medium. consecutive dots and colons aside, there are probably things you still want to avoid (like non-ascii chars)

rtkrruvinskiy commented 9 years ago

Sure, I'd be fine with that. What checks would the levels correspond to?

Dieterbe commented 9 years ago

strict = current validation.
none = no validation.
medium = would prevent very crazy stuff (non-ascii characters, nulls, ...) and stuff that is known to break graphite (can't remember, but there's a char or two that graphite doesn't handle well), but still permits discouraged things like unusual ascii chars, duplicate dots, etc.
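
For anyone picking this up as a PR, here is a rough sketch of how the three levels could fit together. It is not carbon-relay-ng's actual code: the `Level` type, the constant names, and `validateKey` are hypothetical. The strict branch only covers the two rejections reported in this issue (consecutive dots and colons), not the full current validation. The medium branch rejects non-ASCII bytes, nulls and other control characters, and whitespace. The whitespace check is my own assumption, since the plaintext protocol separates fields with spaces. The "char or two" that graphite handles badly is left out because nobody in the thread has named it.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Level is a hypothetical validation level; the names follow this thread.
type Level int

const (
	None   Level = iota // no validation at all
	Medium              // reject only input that is clearly broken
	Strict              // Medium plus the checks reported in this issue
)

// validateKey checks a metric name against the given level.
// It returns nil if the name is acceptable at that level.
func validateKey(key string, level Level) error {
	if level == None {
		return nil
	}
	if key == "" {
		return errors.New("empty metric name")
	}
	// Medium: reject non-ASCII bytes, NUL and other control characters,
	// and whitespace (assumed here because the plaintext protocol is
	// space-separated).
	for i := 0; i < len(key); i++ {
		c := key[i]
		if c >= 0x80 {
			return fmt.Errorf("non-ASCII byte at offset %d", i)
		}
		if c <= 0x20 || c == 0x7f {
			return fmt.Errorf("control or whitespace byte at offset %d", i)
		}
	}
	if level == Medium {
		return nil
	}
	// Strict: only the two rejections mentioned in this issue are shown.
	if strings.Contains(key, "..") {
		return errors.New("consecutive dots in metric name")
	}
	if strings.Contains(key, ":") {
		return errors.New("colon in metric name")
	}
	return nil
}
```

With this shape, a name like `servers..web01.cpu:user` passes at none and medium but fails at strict, which is the behavior asked for in the original report.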

rtkrruvinskiy commented 9 years ago

Okay. Is this sort of thing already on your todo list, or would you be looking for a PR from the outside?

Dieterbe commented 9 years ago

outside PR. i don't use carbon-relay-ng anymore and don't have much time to work on it.

rtkrruvinskiy commented 9 years ago

Have you switched to using something else, or is it that you don't work with metrics anymore?

Dieterbe commented 9 years ago

oh i definitely still work with metrics! (i joined raintank, a 100% open source monitoring company.) the thing about carbon-relay-ng is that, because it uses the graphite proto (whether plaintext or pickle), you don't have acks. so if an endpoint goes down, carbon-relay-ng just sends the last i-forgot-i-think-like-1000-or-so metrics and hopes that that covers whatever the endpoint missed. the graphite proto doesn't have a means for an upstream sender to know what was successfully processed and what was not, so you have to work around that.
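
To make the "resend the last N metrics" workaround concrete, here is a minimal sketch of a fixed-size replay buffer. It illustrates the idea only and is not how carbon-relay-ng actually implements it. `replayBuffer`, `newReplayBuffer`, `add`, and `snapshot` are hypothetical names, and the buffer size is up to the caller (the ~1000 figure above is a rough recollection).

```go
package main

// replayBuffer keeps the most recent lines written to an endpoint so they
// can be sent again after a reconnect. Without acks the sender cannot know
// which lines were lost, so it replays everything it still holds.
type replayBuffer struct {
	lines [][]byte
	next  int  // index where the next line will be written
	full  bool // true once the buffer has wrapped at least once
}

// newReplayBuffer creates a buffer holding at most size lines.
func newReplayBuffer(size int) *replayBuffer {
	return &replayBuffer{lines: make([][]byte, size)}
}

// add stores a copy of line, overwriting the oldest entry when full.
func (b *replayBuffer) add(line []byte) {
	if len(b.lines) == 0 {
		return
	}
	b.lines[b.next] = append([]byte(nil), line...)
	b.next = (b.next + 1) % len(b.lines)
	if b.next == 0 {
		b.full = true
	}
}

// snapshot returns the buffered lines, oldest first.
func (b *replayBuffer) snapshot() [][]byte {
	if !b.full {
		return append([][]byte(nil), b.lines[:b.next]...)
	}
	out := make([][]byte, 0, len(b.lines))
	out = append(out, b.lines[b.next:]...)
	out = append(out, b.lines[:b.next]...)
	return out
}
```

On reconnect the sender would write everything from `snapshot()` before resuming live traffic. The trade-off is the one described above: lines the endpoint already received get sent twice, and anything older than the buffer is gone.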

at raintank, we've been using rabbitmq and will actually switch to NSQ soon. our backend tsdb right now is kairosdb, which does provide acking. we don't really need filtering metrics streams (yet) so we're okay with using a general purpose messaging system that doesn't "work in" the graphite domain (parsing messages, routing based on key, etc). we also wanted something that has more durability guarantees and allows easily creating distributed topologies. nsq is really good at that.

Dieterbe commented 8 years ago

@rtkrruvinskiy actually you may be interested to know that raintank decided to pick up carbon-relay-ng, as we use it to send metrics to our hosted metrics platform. so if you've noticed some increased activity lately, this would explain :)