Open kallaics opened 2 months ago
I've tested the flux2-monitoring-example and verified that we were using kube-state-metrics v2.12.0. It does not seem to resolve the issue completely, though some metrics came back. In https://github.com/fluxcd/flux2-monitoring-example/issues/32 you can see that only the "HelmRelease" metrics were returned; the metrics for the other resource kinds did not come back.
I did some tests and found that it's related to the code change of the SanitizeHeaders function in #2270: https://github.com/kubernetes/kube-state-metrics/pull/2270/files#diff-60450a33adea08c953656dd1e78a80e9f3b279bbc7656dedf31fd1a0c7fc1196
The issue seems to be in the help: "The current state of a GitOps Toolkit resource." message. If you make this one unique (e.g. a different one for HelmRelease, Kustomization, etc.), the metrics do not get removed by the function mentioned above.
I am just not sure if that's a bug or a feature; maybe the author @rexagod knows?
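To make the observation above concrete, here is a minimal Python sketch of how a pass that drops repeated headers could treat metric families with identical help texts as duplicates. It is an illustration under stated assumptions, not the actual SanitizeHeaders code from kube-state-metrics: the function names, the header format, and the metric name `gotk_resource_info` are all hypothetical.

```python
# Simplified illustration only; this is NOT the kube-state-metrics implementation.

def header(name: str, help_text: str, metric_type: str) -> str:
    """Build a Prometheus-style HELP/TYPE header for one metric family."""
    return f"# HELP {name} {help_text}\n# TYPE {name} {metric_type}"


def drop_consecutive_duplicates(headers: list[str]) -> list[str]:
    """Keep a header only if it differs from the one immediately before it."""
    kept: list[str] = []
    for h in headers:
        if not kept or kept[-1] != h:
            kept.append(h)
    return kept


KINDS = ["Kustomization", "HelmRelease", "GitRepository"]
METRIC = "gotk_resource_info"  # hypothetical metric name, for illustration

# Same name, same help, same type: the three headers are byte-for-byte identical.
shared_help = [
    header(METRIC, "The current state of a GitOps Toolkit resource.", "gauge")
    for _ in KINDS
]

# Same name and type, but a help text that names the kind: all headers differ.
unique_help = [
    header(METRIC, f"The current state of a Flux {kind} resource.", "gauge")
    for kind in KINDS
]

print(len(drop_consecutive_duplicates(shared_help)))  # 1
print(len(drop_consecutive_duplicates(unique_help)))  # 3
```

With a shared help text only one header survives the pass, whereas kind-specific help texts keep one header per resource kind. That is consistent with the workaround described above, although the real function may differ in detail.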
/assign @CatherineF-dev /triage accepted
I can confirm. After I changed the "help" fields, the metrics appeared in Prometheus and Grafana. Thanks @speer!
Hello, apologies for the late response.
Prometheus' protobuf machinery does not support all OpenMetrics types at the moment (https://github.com/kubernetes/kube-state-metrics/issues/2248). To resolve this, #2270 was merged, which implicitly converted stateset and info metrics to gauge metrics before piping them out (PTAL at these test-cases). This, in turn, gave rise to cases where metrics that previously appeared not to conflict could now start to conflict, which is why the patch had to include a deduplicating capability. The issue raised here is a side effect of that deduplication.
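As a rough illustration of why the type conversion can create new conflicts, the sketch below rewrites `info` and `stateset` in a `# TYPE` line to `gauge`. The helper `to_gauge` and the metric name `example_metric` are hypothetical and only mirror the behaviour described above; this is not the code merged in #2270.

```python
# Illustrative sketch, assuming headers are plain "# TYPE <name> <type>" lines.

CONVERTED_TYPES = ("info", "stateset")


def to_gauge(type_line: str) -> str:
    """Rewrite '# TYPE <name> info|stateset' to '# TYPE <name> gauge'."""
    parts = type_line.split(" ")
    is_type_line = len(parts) == 4 and parts[0] == "#" and parts[1] == "TYPE"
    if is_type_line and parts[3] in CONVERTED_TYPES:
        parts[3] = "gauge"
    return " ".join(parts)


# Two headers that differ only in their OpenMetrics type...
before = ["# TYPE example_metric info", "# TYPE example_metric stateset"]
after = [to_gauge(line) for line in before]

# ...become identical once both are converted to gauge.
print(len(set(before)))  # 2
print(len(set(after)))   # 1
```

Once both lines read `# TYPE example_metric gauge`, nothing in the header distinguishes the two families any more, which is the kind of collision a deduplication step then has to handle.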
https://github.com/fluxcd/flux2-monitoring-example/issues/32#issuecomment-2059346695 presents a take on this that has been the implicit sentiment for such configuration scenarios, i.e., if the use case warrants different groupVersionKind definitions, they should ideally be accompanied by different help texts to indicate what changed between them.
I'd be happy to follow this up by pointing out the caveat observed here in the documentation for future instances.
What happened:
The KSM configuration worked well up to KSM version v2.10.1. After the upgrade to v2.11.0, Prometheus reported an "invalid metric type" error message. The latest version, v2.12.0, solved the "invalid metric type" issue, but the output now contains only one resource type per metric. The deployment and configuration did not change during this period.
The issue affects the "build_info" metric name.
What you expected to happen:
To provide Prometheus output with the same metric name for more than one resource type.
How to reproduce it (as minimally and precisely as possible):
Anything else we need to know?:
Environment:
Kubernetes version (use kubectl version): 1.28.5