Problem
The /client/stats endpoint currently exposes a variety of utilization information, including CPUTicksConsumed and Memory.Used, which can be used to determine percent utilization for the host as a whole. It cannot currently be used to determine how much of that host utilization is attributable to allocations.
The output of nomad node status contains the following, which includes allocation resource utilization:
Allocated Resources
CPU             Memory          Disk
1000/38400 MHz  512 MiB/32 GiB  300 MiB/292 GiB

Allocation Resource Utilization
CPU           Memory
14/38400 MHz  1.9 MiB/32 GiB

Host Resource Utilization
CPU             Memory         Disk
2258/38400 MHz  22 GiB/32 GiB  156 GiB/466 GiB

Allocations
ID        Node ID   Task Group  Version  Desired  Status   Created  Modified
3029bd2a  e939dc3a  cache       0        run      running  6s ago   4s ago
The CLI achieves this by first getting the allocations on the client and then aggregating their individual resource utilization via /client/allocation/:id/stats, one request per allocation.
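The aggregation described above can be sketched roughly as follows. This is a minimal illustration, not the CLI's actual code: AllocUsage, SumAllocUsage, and Percent are hypothetical names, the struct is a simplified stand-in for the per-allocation stats payload, and the sample figures are made up.

```go
package main

import "fmt"

// AllocUsage is a hypothetical, simplified stand-in for the figures returned
// by /client/allocation/:id/stats for a single allocation.
type AllocUsage struct {
	CPUTicks    float64 // CPU consumed by the allocation, in MHz
	MemoryBytes uint64  // memory in use by the allocation, in bytes
}

// SumAllocUsage adds up usage across every allocation on a node. Today the
// CLI does the equivalent after fetching stats for each allocation in turn.
func SumAllocUsage(allocs []AllocUsage) AllocUsage {
	var total AllocUsage
	for _, a := range allocs {
		total.CPUTicks += a.CPUTicks
		total.MemoryBytes += a.MemoryBytes
	}
	return total
}

// Percent returns used/capacity as a percentage, or 0 when capacity is 0.
func Percent(used, capacity float64) float64 {
	if capacity == 0 {
		return 0
	}
	return used / capacity * 100
}

func main() {
	// Illustrative values only; real values would come from the stats API.
	allocs := []AllocUsage{
		{CPUTicks: 10, MemoryBytes: 1 << 20},
		{CPUTicks: 4, MemoryBytes: 1 << 19},
	}
	total := SumAllocUsage(allocs)
	fmt.Printf("CPU: %.0f/38400 MHz (%.3f%%)\n", total.CPUTicks, Percent(total.CPUTicks, 38400))
	fmt.Printf("Memory: %d bytes\n", total.MemoryBytes)
}
```

With N allocations on a node, this approach costs N stats requests plus the request that lists the allocations, every time the figures are refreshed.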
Proposal
Move this aggregating logic into the API layer.
This would allow the UI to also present this information without making an excessive amount of API requests (especially considering the UI polls this endpoint on a 2s interval).
Consideration: ACLs
Access to allocation stats is governed by the namespace:read-job permission, while access to client stats is governed by node:read. As part of this proposal, we're acknowledging that allocation stats in aggregate are acceptable to read with only the node:read permission.
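To make the ACL trade-off concrete, here is a small sketch of the two rules side by side. Everything in it is hypothetical: Caps, CanReadAllocStats, and CanReadAggregateAllocStats are illustrative names and do not mirror Nomad's ACL package.

```go
package main

import "fmt"

// Caps is a hypothetical summary of what an ACL token allows.
type Caps struct {
	NodeRead          bool            // token has node:read
	ReadJobNamespaces map[string]bool // namespaces where the token has read-job
}

// CanReadAllocStats models the existing rule: stats for a single allocation
// require read-job on that allocation's namespace.
func CanReadAllocStats(c Caps, namespace string) bool {
	return c.ReadJobNamespaces[namespace]
}

// CanReadAggregateAllocStats models the proposed rule: the node-wide
// aggregate of allocation stats only requires node:read.
func CanReadAggregateAllocStats(c Caps) bool {
	return c.NodeRead
}

func main() {
	operator := Caps{NodeRead: true} // no namespace permissions at all
	fmt.Println("aggregate:", CanReadAggregateAllocStats(operator))
	fmt.Println("per-alloc in default:", CanReadAllocStats(operator, "default"))
}
```

Under this model, a token with node:read alone can see the combined figure for a node but still cannot see the stats of any individual allocation.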
Response Shape
The allocation stats response already aggregates the stats figures for an allocation into a single resource usage object.
The client stats response can take this same shape and further aggregate across all allocations on the client. The property name should be something like AllocatedResourceUsage or AllocationResourceUsage.
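One way the extended client stats response could look is sketched below. This is a hedged illustration: only CPUTicksConsumed, Memory.Used, and the candidate property name AllocationResourceUsage come from this issue. The nested field names (CpuStats.TotalTicks, MemoryStats.Usage) and the Encode helper are assumptions made for the example, and the structs are heavily abridged.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CpuStats and MemoryStats are abridged, assumed shapes for the aggregated
// usage figures; the real payload carries more fields.
type CpuStats struct {
	TotalTicks float64
}

type MemoryStats struct {
	Usage uint64
}

// ResourceUsage is the shape shared by allocation stats and, under this
// proposal, the new aggregate on client stats.
type ResourceUsage struct {
	CpuStats    *CpuStats
	MemoryStats *MemoryStats
}

// HostMemory is an abridged stand-in for the host memory section.
type HostMemory struct {
	Used uint64
}

// HostStats is an abridged sketch of the /client/stats response with the
// proposed property added.
type HostStats struct {
	CPUTicksConsumed        float64
	Memory                  HostMemory
	AllocationResourceUsage *ResourceUsage // proposed: sum over all allocations
}

// Encode renders the response as compact JSON.
func Encode(h HostStats) (string, error) {
	b, err := json.Marshal(h)
	return string(b), err
}

func main() {
	// Illustrative values only.
	out, err := Encode(HostStats{
		CPUTicksConsumed: 2258,
		Memory:           HostMemory{Used: 22 << 30},
		AllocationResourceUsage: &ResourceUsage{
			CpuStats:    &CpuStats{TotalTicks: 14},
			MemoryStats: &MemoryStats{Usage: 2 << 20},
		},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

Reusing the existing usage shape means a consumer such as the UI could share the same rendering code for a single allocation and for the node-wide total.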
Hidden benefit
As @cgbaker pointed out, aggregating all alloc stats at once on the client saves us from round-tripping from the server to the client N times as is currently the case with the CLI implementation.
Related Issues
6892
8694