hashicorp / nomad

Nomad is an easy-to-use, flexible, and performant workload orchestrator that can deploy a mix of microservice, batch, containerized, and non-containerized applications. Nomad is easy to operate and scale and has native Consul and Vault integrations.
https://www.nomadproject.io/

Document order of tags in metrics documentation #11055

Open Fuco1 opened 3 years ago

Fuco1 commented 3 years ago

Issue

On https://www.nomadproject.io/docs/operations/metrics#host-metrics the labels (tags) are listed alphabetically. However, this is not how labels are handled in the statsd protocol. There, only the tag values are emitted, dot-separated, and the label names are implied purely by their position.

So the output of the metrics might be a string:

nomad.client.host.cpu.total.node_id.datacenter.node_class.node_status.node_scheduling_eligibility.cpu
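To make the ordering concrete, here is a minimal Python sketch of how a statsd-style sink could flatten a metric key and an ordered list of labels into a single dotted name. The function `flatten_metric` and the sample label values are hypothetical and chosen only for illustration; this is not Nomad's actual code.

```python
def flatten_metric(key_parts, labels):
    """Join key parts and label values with dots.

    Label names are dropped, so only the position of each value
    tells a consumer which label it belongs to.
    """
    values = [value for _name, value in labels]
    return ".".join(list(key_parts) + values)


# Hypothetical label values; the label order mirrors the example string above.
labels = [
    ("node_id", "abc123"),
    ("datacenter", "dc1"),
    ("node_class", "compute"),
    ("node_status", "ready"),
    ("node_scheduling_eligibility", "eligible"),
    ("cpu", "cpu0"),
]

name = flatten_metric(["nomad", "client", "host", "cpu", "total"], labels)
print(name)
```

Reordering the `labels` list changes the resulting string without any other signal to the consumer, which is why the emitted order matters.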

This matters for later parsing, when the values have to be assigned back to their labels. For example, with Telegraf you write a template such as

"nomad.client.host.cpu.* measurement.measurement.measurement.field.field......cpu"

This would produce the series nomad_client_host with a field cpu_total (when measurement or field is repeated, the matching parts are joined by _ into a single identifier). An empty position (..) drops the value at that position; any other name becomes the label for that position's value.

(The documentation can be found here: https://github.com/influxdata/telegraf/tree/master/plugins/inputs/statsd#statsd-bucket---influxdb-line-protocol-templates)
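The positional matching described above can be sketched roughly as follows. This is a simplified illustration in Python, not Telegraf's implementation: the function `apply_template` is hypothetical, it skips the filter part of the template (`nomad.client.host.cpu.*`), and it assumes the template and the bucket have the same number of dot-separated parts. The sample tag values are made up.

```python
def apply_template(bucket, template, separator="_"):
    """Map each dot-separated value in `bucket` to the role at the same
    position in `template`.

    Roles: "measurement" and "field" parts are collected and joined with
    `separator`; an empty role skips the value; any other role becomes a
    tag name for the value at that position.
    """
    values = bucket.split(".")
    roles = template.split(".")
    if len(values) != len(roles):
        raise ValueError("template and bucket have different lengths")

    measurement, field, tags = [], [], {}
    for role, value in zip(roles, values):
        if role == "measurement":
            measurement.append(value)
        elif role == "field":
            field.append(value)
        elif role:  # non-empty name: use it as the tag key
            tags[role] = value
        # empty role: the value is dropped
    return separator.join(measurement), separator.join(field), tags


# Hypothetical values for node_id, datacenter, node_class, and so on.
bucket = "nomad.client.host.cpu.total.abc123.dc1.compute.ready.eligible.cpu0"
template = "measurement.measurement.measurement.field.field......cpu"

measurement, field, tags = apply_template(bucket, template)
print(measurement, field, tags)
```

Because the mapping is purely positional, a template written against one label order silently assigns values to the wrong tags if that order changes.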

I had to find the actual order from the source: https://github.com/hashicorp/nomad/blob/dfb313a6da0954501cc79325ee227c01e7596097/client/client.go#L2797-L2801 and https://github.com/hashicorp/nomad/blob/dfb313a6da0954501cc79325ee227c01e7596097/client/client.go#L2837-L2840

It would be very helpful to list the tags in the order they are produced instead of alphabetically. Also, if the order ever changes, any metrics collection built on something like statsd and Telegraf can break, since the parsing is order-based.

lgfa29 commented 3 years ago

Thank you for the report @Fuco1, I don't know how common knowledge this is, but I for one learned something new today 😄

I don't think our docs are incorrect, since labels are often thought of as unsorted <key>:<value> pairs, but as you pointed out this is not true for all systems, so we should document their order and try not to change it.