Closed qinqon closed 3 months ago
@machadovilaca @avlitman please see if you can help in reviewing this PR.
@avlitman @machadovilaca @sradco can you take another look ?
@machadovilaca: changing LGTM is restricted to collaborators
@qinqon I suggest you also add a docs generator for the operator. I think it is really useful.
We have automation so that when a user opens a PR with a new metric, a test runs and checks whether the metric is already documented. If it is not, the user is asked to run make generate, which automatically updates the PR with the change to the metrics.md file, adding the new metric, its description and its type.
See an example of the metrics.md file here: https://github.com/kubevirt/hyperconverged-cluster-operator/blob/main/docs/metrics.md. The docs generator is here: https://github.com/kubevirt/hyperconverged-cluster-operator/blob/main/tools/metricsdocs/metricsdocs.go (Note: we plan to move it to /monitoring/tools/)
@sradco introducing it adds a lot of golang dependencies to the project. I am not sure about it; maybe we can do this in a follow-up.
@sradco @machadovilaca I have converted the Counter to a Gauge and it is now decremented when a topology/feature is no longer in use. Can you take another look to see if everything is OK from a monitoring perspective?
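As a rough sketch of why a gauge fits here: a counter can only grow, while a gauge can go down again once a feature stops being used. The FeatureGauge type, its methods and the feature names below are hypothetical and stdlib-only; they are not the PR's code nor the Prometheus client API.

```go
package main

import "fmt"

// FeatureGauge is a hypothetical stand-in for a labelled gauge: it tracks how
// many enactments currently use each feature.
type FeatureGauge struct {
	values map[string]int
}

// NewFeatureGauge returns an empty gauge.
func NewFeatureGauge() *FeatureGauge {
	return &FeatureGauge{values: map[string]int{}}
}

// Update moves the gauge from the features one enactment reported previously
// to the features it reports now: new features are incremented, features that
// disappeared are decremented.
func (g *FeatureGauge) Update(previous, current []string) {
	prev, cur := toSet(previous), toSet(current)
	for f := range cur {
		if !prev[f] {
			g.values[f]++
		}
	}
	for f := range prev {
		if !cur[f] {
			g.values[f]--
			if g.values[f] <= 0 {
				// Drop the series once nobody uses the feature.
				delete(g.values, f)
			}
		}
	}
}

// Value returns the current count for a feature (0 if unused).
func (g *FeatureGauge) Value(feature string) int {
	return g.values[feature]
}

func toSet(items []string) map[string]bool {
	set := make(map[string]bool, len(items))
	for _, item := range items {
		set[item] = true
	}
	return set
}

func main() {
	g := NewFeatureGauge()
	// Two enactments start using features (names are illustrative only).
	g.Update(nil, []string{"example-bond", "example-vlan"})
	g.Update(nil, []string{"example-bond"})
	// The first enactment stops using example-vlan.
	g.Update([]string{"example-bond", "example-vlan"}, []string{"example-bond"})
	fmt.Println(g.Value("example-bond"), g.Value("example-vlan")) // prints "2 0"
}
```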
We are going to stick with features for now, since the set of features is bounded; we will investigate options for topology.
@qinqon Can you please add an example to the PR description of the end metric with labels?
/retest
/retest
Trying to pull registry.access.redhat.com/ubi9/ubi-minimal:latest...
Error: creating build container: copying system image from manifest list: determining manifest MIME type for docker://registry.access.redhat.com/ubi9/ubi-minimal:latest: reading manifest sha256:119ac25920c8bb50c8b5fd75dcbca369bf7d1f702b82f3d39663307890f0bf26 in registry.access.redhat.com/ubi9/ubi-minimal: received unexpected HTTP status: 502 Bad Gateway
make[1]: *** [Makefile:168: push-operator] Error 125
make[1]: Lea
@sradco can you take another look? I think I have covered all the comments.
/hold
nmstate from the "base" branch is failing a metrics test on the "future" lane.
/retest
Locally, with NMSTATE_PIN=future, everything is fine.
/retest
/hold cancel
Now it is ready
/retest
/lgtm
/approve
/retest
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: qinqon
The full list of commands accepted by this bot can be found here.
The pull request process is described here
/retest
Is this a BUG FIX or a FEATURE ?: /kind enhancement
What this PR does / why we need it: Now that nmstatectl is able to calculate some useful stats from the network configuration [1], we can bubble them up and expose them as k8s metrics so that k-nmstate users can dig into them using Prometheus, Grafana or the like.
This change adds a new "Features" field under the NNCE Status with the output of nmstatectl st, and also creates a new deployment, nmstate-metrics, that will gather the NNCEs' features and reflect them in a cluster-wide Prometheus gauge metric. This is an example of nmstate feature stat
Depends on nmstate 2.2.20; it looks like it has been built but is not yet present in CentOS Stream 9.
[1] https://github.com/nmstate/nmstate/pull/2420
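A minimal sketch of the aggregation step described above, assuming each NNCE reports its features as a plain list of strings. The function names, the metric name example_features_applied, the name label and the enactment and feature names are all made up for illustration; the real deployment may be structured quite differently.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// countFeatures returns, for every feature, how many enactments report it.
// A feature listed twice by the same enactment is counted once.
func countFeatures(featuresByEnactment map[string][]string) map[string]int {
	counts := map[string]int{}
	for _, features := range featuresByEnactment {
		seen := map[string]bool{}
		for _, f := range features {
			if !seen[f] {
				seen[f] = true
				counts[f]++
			}
		}
	}
	return counts
}

// renderGauge formats the counts in the Prometheus text exposition format,
// one sample per feature, sorted by feature name.
func renderGauge(metricName string, counts map[string]int) string {
	names := make([]string, 0, len(counts))
	for n := range counts {
		names = append(names, n)
	}
	sort.Strings(names)

	var b strings.Builder
	fmt.Fprintf(&b, "# TYPE %s gauge\n", metricName)
	for _, n := range names {
		fmt.Fprintf(&b, "%s{name=%q} %d\n", metricName, n, counts[n])
	}
	return b.String()
}

func main() {
	// Hypothetical enactments and feature names.
	enactments := map[string][]string{
		"node1.example-policy": {"example-bond", "example-vlan"},
		"node2.example-policy": {"example-bond"},
	}
	fmt.Print(renderGauge("example_features_applied", countFeatures(enactments)))
}
```

Recomputing the counts from the full set of enactments on every pass keeps the gauge correct even if an update event is missed, at the cost of listing all NNCEs each time.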
TODO:
Release note: