sensu / sensu-go

Simple. Scalable. Multi-cloud monitoring.
https://sensu.io
MIT License

Add support for output_metric_tags #2160

Closed calebhailey closed 4 years ago

calebhailey commented 6 years ago

Expected Behavior

Add a new output_metric_tags attribute to the Check resource/specification. Allowed values should provide dot-notation access to any event attribute (e.g. entity.id, entity.organization, entity.system.os, check.name, etc). Here's an example of how it might be implemented:

type: Check
api_version: core/v2
metadata:
  namespace: default
  name: check_sensu_io
spec: 
  command: "check_http -u https://sensu.io/ -H sensu.io -N"
  runtime_assets: 
  - check_http_v0.1
  publish: true
  interval: 10
  subscriptions: 
  - docker
  output_metric_format: nagios_perfdata
  output_metric_handlers:
  - influxdb
  output_metric_tags: 
    entity: entity.name 
    namespace: entity.namespace
    app: entity.metadata.labels.application_id

Bonus points if we could support access to array objects (e.g. entity.subscriptions[0] to get the first subscription), or if we supported arbitrary tags (i.e. not matching an event attribute).
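To make the proposal concrete, here is a rough Python sketch of how dot-notation paths (including the `[0]` array form) could be resolved against an event and turned into tags. It is only an illustration, not how Sensu implements anything: `resolve_path`, `build_tags`, the sample event, and the `team: web` tag are all hypothetical, and falling back to a literal value when a path does not resolve is just one possible way to support arbitrary tags.

```python
import re

# Matches one path segment such as "subscriptions" or "subscriptions[0]".
_SEGMENT = re.compile(r"^([^\[\]]+)(?:\[(\d+)\])?$")


def resolve_path(event, path):
    """Return the value at a dot-notation path in a nested dict, or None."""
    current = event
    for segment in path.split("."):
        match = _SEGMENT.match(segment)
        if match is None or not isinstance(current, dict):
            return None
        key, index = match.group(1), match.group(2)
        if key not in current:
            return None
        current = current[key]
        if index is not None:
            # Array access only applies to lists and must be in range.
            if not isinstance(current, list) or int(index) >= len(current):
                return None
            current = current[int(index)]
    return current


def build_tags(event, tag_spec):
    """Build a list of {"name", "value"} tags from a tag-name -> path mapping."""
    tags = []
    for name, path in tag_spec.items():
        value = resolve_path(event, path)
        if value is None:
            # Assumption: an unresolvable path is kept as an arbitrary literal tag.
            value = path
        tags.append({"name": name, "value": str(value)})
    return tags


# Hypothetical event, shaped to match the paths used in the example above.
event = {
    "entity": {
        "name": "46e492cdd189",
        "namespace": "default",
        "subscriptions": ["docker"],
        "metadata": {"labels": {"application_id": "website"}},
    },
    "check": {"name": "check_sensu_io"},
}

tag_spec = {
    "entity": "entity.name",
    "namespace": "entity.namespace",
    "app": "entity.metadata.labels.application_id",
    "subscription": "entity.subscriptions[0]",
    "team": "web",  # hypothetical arbitrary tag, not an event attribute
}

tags = build_tags(event, tag_spec)
```

One trade-off of the literal fallback: a mistyped path silently becomes a tag value instead of raising an error, so a real implementation might want a separate syntax for arbitrary tags.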

Current Behavior

No support is provided for collecting tags during metric output extraction.

Possible Solution

See description above.

Context

Currently, when using metric output extraction with the trusty Nagios check_http plugin, for a check output that looks like HTTP OK: HTTP/1.1 301 Moved Permanently - 280 bytes in 0.043 second response time |time=0.043044s;;;0.000000;10.000000 size=280B;;;0, the corresponding event will have empty tag values as shown in this event excerpt:

{
  "timestamp": 1539194388,
  "entity": {
    "class": "agent",
    "id": "46e492cdd189",
    "...": "..."
  },
  "check": {
    "name": "check_sensu_io",
    "command": "check_http -u https://sensu.io/ -H sensu.io -N",
    "output": "HTTP OK: HTTP/1.1 301 Moved Permanently - 197 bytes in 0.034 second response time |time=0.033662s;;;0.000000;10.000000 size=197B;;;0\n",
    "...": "...",
    "metrics": {
      "handlers": [
        "influxdb"
      ],
      "points": [
        {
          "name": "time",
          "value": 0.033662,
          "timestamp": 1539194388,
          "tags": []
        },
        {
          "name": "size",
          "value": 197,
          "timestamp": 1539194388,
          "tags": []
        }
      ]
    }
  }
}
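For context on where those empty `tags` lists come from, here is a minimal Python sketch of extracting points from Nagios PerfData. It is not Sensu's extraction code; `parse_perfdata` is a hypothetical helper that reads only the label and value of each item, ignoring units, thresholds, and quoted labels containing spaces. The PerfData format has no place to carry tags, so every point comes out untagged.

```python
import re

# PerfData items look like: label=value[UOM];[warn];[crit];[min];[max]
# Only the label and the numeric value are captured in this sketch.
_PERF_ITEM = re.compile(r"^(?P<label>[^=]+)=(?P<value>-?\d+(?:\.\d+)?)")


def parse_perfdata(output, timestamp):
    """Turn the PerfData part of a check output into a list of metric points."""
    # Everything after the first "|" is performance data.
    _, separator, perfdata = output.partition("|")
    if not separator:
        return []
    points = []
    for item in perfdata.split():
        match = _PERF_ITEM.match(item)
        if match is None:
            continue  # skip anything that is not label=value
        points.append({
            "name": match.group("label"),
            "value": float(match.group("value")),
            "timestamp": timestamp,
            "tags": [],  # the format itself provides nothing to put here
        })
    return points


output = (
    "HTTP OK: HTTP/1.1 301 Moved Permanently - 197 bytes in 0.034 second "
    "response time |time=0.033662s;;;0.000000;10.000000 size=197B;;;0\n"
)
points = parse_perfdata(output, 1539194388)
```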

The challenge with this is that when sent to InfluxDB, these appear as very generically named measurements (i.e. "time" and "size"), with no context about which system or service reported the metrics (because there are no tags). Compared with the graphite-style measurement name you would get with Sensu 1.x (e.g. 46e492cdd189.check_sensu_io.time), this leaves something to be desired, even though simple measurement names with tags would be a more native data representation in InfluxDB.

portertech commented 6 years ago

#1508

calebhailey commented 6 years ago

Commenting here to capture for posterity that @nikkiki suggested in Slack that the best way to do this is probably with Tokens... 💯

calebhailey commented 6 years ago

In further discussion on this issue, it is worth noting that this really only impacts metrics collected in formats that don't natively support tags (e.g. Nagios PerfData and Graphite plaintext). Metrics collected in formats that support tags will have their tags extracted or "passed through".

The good/bad news about that is most of the Sensu Community Plugins have standardized around Graphite plaintext format for metrics collection.
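To illustrate the distinction, here is a small Python sketch comparing a format without tags to one with tags. It assumes classic Graphite plaintext (`<path> <value> <timestamp>`) and an OpenTSDB-style line without the `put` prefix (`<metric> <timestamp> <value> <tagk=tagv> ...`); both parser functions and both sample lines are hypothetical and are not Sensu's own extraction code.

```python
def parse_graphite(line):
    """Parse classic Graphite plaintext, which has no tag field."""
    path, value, timestamp = line.split()
    return {
        "name": path,
        "value": float(value),
        "timestamp": int(timestamp),
        "tags": [],  # nothing to pass through
    }


def parse_opentsdb(line):
    """Parse an OpenTSDB-style line; trailing key=value pairs become tags."""
    metric, timestamp, value, *pairs = line.split()
    tags = []
    for pair in pairs:
        key, _, tag_value = pair.partition("=")
        tags.append({"name": key, "value": tag_value})
    return {
        "name": metric,
        "value": float(value),
        "timestamp": int(timestamp),
        "tags": tags,  # tags are "passed through" from the check output
    }


graphite_point = parse_graphite(
    "46e492cdd189.check_sensu_io.time 0.033662 1539194388"
)
opentsdb_point = parse_opentsdb(
    "check_sensu_io.time 1539194388 0.033662 host=46e492cdd189 app=website"
)
```

Only the first kind of point would need something like output_metric_tags to gain context.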

calebhailey commented 4 years ago

/cc @portertech 😆

raags commented 4 years ago

Can something similar be done for StatsD? That is, could metrics generated from StatsD be enriched with the entity metadata that the Sensu client has?