hashicorp / nomad

Nomad is an easy-to-use, flexible, and performant workload orchestrator that can deploy a mix of microservice, batch, containerized, and non-containerized applications. Nomad is easy to operate and scale and has native Consul and Vault integrations.
https://www.nomadproject.io/

Client metrics should use Job ID (not Job Name) for the job field. #8119

Open toadstule opened 4 years ago

toadstule commented 4 years ago

Nomad version

0.10.5

Operating system and Environment details

Linux (CentOS 7)

Issue

Nomad server metrics (such as nomad.job_summary.complete) include a job field that is populated with the Job ID. Nomad client metrics (such as nomad.client.allocs.cpu.allocated) include the same job field, but there it is populated with the Job Name rather than the Job ID.

Reproduction steps

Launch a Nomad job where the name and ID do not match. (In this case, my job ID is exampleID and my job name is example.)

Pull metrics from the leader server:

curl -sk 'https://nomad-server1:4646/v1/metrics?format=prometheus' | grep example | head -n 1
nomad_nomad_job_summary_complete{host="nomad-server1",job="exampleID",namespace="default",task_group="example"} 22

Pull metrics from the client node:

curl -sk 'https://nomad-agent1:4646/v1/metrics?format=prometheus' | grep example | head -n 1
nomad_client_allocs_cpu_allocated{alloc_id="039fc864-aae8-af37-e29a-fde2634e5fae",host="nomad-agent1",job="example",namespace="default",task="example-main",task_group="example"} 60
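To make the mismatch explicit, the two sample lines above can be compared with a short script. This is only a sketch in Python and is not part of Nomad: parse_labels is a hypothetical helper, and its regex assumes label values contain no escaped quotes, which holds for these two samples but not for the Prometheus text format in general.

```python
import re

# Sample lines copied verbatim from the reproduction output above.
SERVER_LINE = (
    'nomad_nomad_job_summary_complete{host="nomad-server1",job="exampleID",'
    'namespace="default",task_group="example"} 22'
)
CLIENT_LINE = (
    'nomad_client_allocs_cpu_allocated{alloc_id="039fc864-aae8-af37-e29a-fde2634e5fae",'
    'host="nomad-agent1",job="example",namespace="default",'
    'task="example-main",task_group="example"} 60'
)

# Matches name="value" pairs; assumes values contain no escaped quotes.
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_labels(line):
    """Return the labels of one Prometheus text-format sample as a dict."""
    start = line.index("{")
    end = line.rindex("}")
    return dict(LABEL_RE.findall(line[start + 1:end]))


server_job = parse_labels(SERVER_LINE)["job"]
client_job = parse_labels(CLIENT_LINE)["job"]

print(f"server job label: {server_job}")
print(f"client job label: {client_job}")
print(f"labels match: {server_job == client_job}")
```

For these two samples the script reports exampleID for the server and example for the client, so a query that joins server and client metrics on the job label would not line up.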

Code examples

Leader server metrics use summary.JobID https://github.com/hashicorp/nomad/blob/master/nomad/leader.go#L742

Client metrics use alloc.job.Name https://github.com/hashicorp/nomad/blob/master/client/allocrunner/taskrunner/task_runner.go#L372

Notes

Personally, I would prefer that both metrics endpoints report the Job ID (rather than the Job Name). Either way, the fact that they are inconsistent makes it difficult to view metrics for a given job from both the server and the agent node.

Another option, if you do not want to change the existing fields, would be to add a job_id field that is consistently populated with the Job ID.

tgross commented 4 years ago

Hi @toadstule! I'm pretty sure that should be the Job ID already. I'll mark this as a bug in the metrics code to investigate. Thanks for opening this!