project-codeflare / multi-cluster-app-dispatcher

Holistic job manager on Kubernetes
Apache License 2.0

Add histogram metrics of the requested resources #676

Open ronensc opened 1 year ago

ronensc commented 1 year ago

Issue link

What changes have been made

This PR adds custom histogram metrics for the resources (CPU, memory, and GPU) requested by AppWrappers.

Verification steps

As in #651 and #675, I checked that the metrics are exposed by the codeflare-operator by running it:

# in codeflare-operator root
go mod edit -replace github.com/project-codeflare/multi-cluster-app-dispatcher=../multi-cluster-app-dispatcher
make build
NAMESPACE=default go run ./main.go -kubeconfig <path-to-kubeconfig>

and in a new shell:

curl localhost:8080/metrics | grep mcad_requested

# HELP mcad_requested_cpu Histogram of requested CPU (in millicores)
# TYPE mcad_requested_cpu histogram
mcad_requested_cpu_bucket{le="100"} 0
mcad_requested_cpu_bucket{le="200"} 0
mcad_requested_cpu_bucket{le="500"} 0
mcad_requested_cpu_bucket{le="1000"} 0
mcad_requested_cpu_bucket{le="2000"} 0
mcad_requested_cpu_bucket{le="5000"} 0
mcad_requested_cpu_bucket{le="10000"} 0
mcad_requested_cpu_bucket{le="+Inf"} 0
mcad_requested_cpu_sum 0
mcad_requested_cpu_count 0
# HELP mcad_requested_gpu Histogram of requested GPU
# TYPE mcad_requested_gpu histogram
mcad_requested_gpu_bucket{le="1"} 0
mcad_requested_gpu_bucket{le="2"} 0
mcad_requested_gpu_bucket{le="4"} 0
mcad_requested_gpu_bucket{le="8"} 0
mcad_requested_gpu_bucket{le="16"} 0
mcad_requested_gpu_bucket{le="32"} 0
mcad_requested_gpu_bucket{le="64"} 0
mcad_requested_gpu_bucket{le="128"} 0
mcad_requested_gpu_bucket{le="256"} 0
mcad_requested_gpu_bucket{le="512"} 0
mcad_requested_gpu_bucket{le="+Inf"} 0
mcad_requested_gpu_sum 0
mcad_requested_gpu_count 0
# HELP mcad_requested_memory_bytes Histogram of requested memory
# TYPE mcad_requested_memory_bytes histogram
mcad_requested_memory_bytes_bucket{le="1.34217728e+08"} 0
mcad_requested_memory_bytes_bucket{le="2.68435456e+08"} 0
mcad_requested_memory_bytes_bucket{le="5.36870912e+08"} 0
mcad_requested_memory_bytes_bucket{le="1.073741824e+09"} 0
mcad_requested_memory_bytes_bucket{le="2.147483648e+09"} 0
mcad_requested_memory_bytes_bucket{le="4.294967296e+09"} 0
mcad_requested_memory_bytes_bucket{le="8.589934592e+09"} 0
mcad_requested_memory_bytes_bucket{le="1.7179869184e+10"} 0
mcad_requested_memory_bytes_bucket{le="3.4359738368e+10"} 0
mcad_requested_memory_bytes_bucket{le="6.8719476736e+10"} 0
mcad_requested_memory_bytes_bucket{le="+Inf"} 0
mcad_requested_memory_bytes_sum 0
mcad_requested_memory_bytes_count 0
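
For readers checking the `le` values above, here is a minimal standard-library sketch of how such bucket boundaries can be generated and how cumulative bucket counts behave. The helper names `exponentialBuckets` and `cumulativeCounts` and the sample observations are hypothetical and not taken from this PR. The Prometheus Go client provides `prometheus.ExponentialBuckets` for the same purpose, but whether this PR uses it is an assumption.

```go
package main

import "fmt"

// exponentialBuckets returns count upper bounds, beginning at start and
// multiplying by factor at each step. (Hypothetical helper for illustration.)
func exponentialBuckets(start, factor float64, count int) []float64 {
	buckets := make([]float64, count)
	for i := range buckets {
		buckets[i] = start
		start *= factor
	}
	return buckets
}

// cumulativeCounts mimics how Prometheus reports histogram buckets: each
// bucket counts every observation less than or equal to its upper bound (le).
// The last element of the result is the implicit +Inf bucket.
func cumulativeCounts(bounds, observations []float64) []uint64 {
	counts := make([]uint64, len(bounds)+1)
	for _, v := range observations {
		for i, le := range bounds {
			if v <= le {
				counts[i]++
			}
		}
		counts[len(bounds)]++ // every observation falls into +Inf
	}
	return counts
}

func main() {
	// Assumed bucket definitions, chosen only to match the le values in the output above.
	cpu := []float64{100, 200, 500, 1000, 2000, 5000, 10000} // millicores
	gpu := exponentialBuckets(1, 2, 10)                      // 1 .. 512
	mem := exponentialBuckets(128*1024*1024, 2, 10)          // 128 MiB .. 64 GiB

	fmt.Println(gpu)
	fmt.Println(mem[0], mem[len(mem)-1])

	// Hypothetical CPU requests of 250m, 1000m and 4000m.
	fmt.Println(cumulativeCounts(cpu, []float64{250, 1000, 4000}))
	// prints [0 0 1 2 2 3 3 3]
}
```

Because the buckets are cumulative, the `+Inf` bucket always equals `_count`, which is why every line reads 0 before any AppWrapper has been observed.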

Checks

openshift-ci[bot] commented 1 year ago

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

Once this PR has been reviewed and has the lgtm label, please assign anishasthana for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

- **[OWNERS](https://github.com/project-codeflare/multi-cluster-app-dispatcher/blob/main/OWNERS)**

Approvers can indicate their approval by writing `/approve` in a comment.
Approvers can cancel approval by writing `/approve cancel` in a comment.