Closed defenestration closed 3 months ago
Could you provide detailed steps on reproducing this issue? You can follow https://github.com/kubernetes/kube-state-metrics/issues/2223#issuecomment-1792850276
The problem seem to be that when Prometheus has --native-histograms
enabled, it tries to negociate for protobuf first and we manually force the text format instead since proto isn't supported: https://github.com/kubernetes/kube-state-metrics/blob/240cffd908220854a27f7e92d8157eaee4dc8d42/pkg/metricshandler/metrics_handler.go#L191-L194.
However, there doesn't seem to be any mechanism in custom-state-metrics to revert back to Prometheus types, so we continue to expose OpenMetrics types whilst the content-type is text.
I think we should rework https://github.com/kubernetes/kube-state-metrics/pull/1930 to enable the OpenMetrics types only when the OpenMetrics format is negociated.
Yes we can definitely improve the negotiation part and contributions that do that are welcome. My current suggestion for to workaround this is not to use Info or StateSet types in your custom config for CRS.
/assign @rexagod /triage accepted
IIUC All unrecognized types are categorized as COUNTERs (also, here), which are validated later on when consuming them in common
, which is where this error pops up from.
As discussed with Damien in a call earlier, the best way forward to bypass the absence of certain OpenMetrics
types in Prometheus, would be to detect if a Proto
negotiation is in effect, and try to specify a text-based format, since owing to the sheer amount of metrics that KSM generates, encoding these into metrics objects to facilitate things for KSM's own protobuf client would increase compute in a way that demands rigorous benchmarking beforehand.
EDIT: This detect-then-enforce approach would keep things stable for dashboards, etc., that have one-or-more use-cases for the HELP
and TYPE
texts. For eg., without this, users may end up seeing gauge
for their info
or stateset
metrics. Additionally, both of these types are used within CRS, and not in native metrics definitions.
I'm wondering we might want to expose the scrape_protocols
API in https://github.com/kubernetes/component-base/tree/master/metrics/prometheus and then bring that here (or introduce this here for now and replace that once we have an impl. in k/k
, since we'll be needing that for https://github.com/kubernetes/kubernetes/pull/119949 anyway.
Ah, scrape_protocols
is a client-side API, I imagine this renders it irrelevant for the use-case here, although, we can still manually set headers from the supported list of text-based formats (as we do right now albeit not with protobufs in consideration).
So essentially, if the requested format is text-based, do nothing, else s/info|stateset/gauge
.
What happened: When kube-state-metrics is configured with custom resources that produce Info and/or StateSet type of metrics, and Prometheus is configured with
--enable-feature=native-histograms
, prometheus is unable to scrape the kube-state-metrics target.Filed an issue with prometheus here https://github.com/prometheus/prometheus/issues/13172 but seems like a possible issue with KSM as well.
What you expected to happen:
Successful scraping of metrics.
How to reproduce it (as minimally and precisely as possible):
Run prometheus with
--enable-feature=native-histograms
and setup kube-state-metrics with a metric of type: Info or Stateset.See a https://github.com/fluxcd/flux2-monitoring-example/issues/14 or https://github.com/defenestration/flux2-monitoring-example/pull/1 for an example setup.
Anything else we need to know?:
With the
native-histograms
feature disabled, prometheus seems to be able to scrape successfully. So a bug is also filed with prometheus.I guess the suggestion would be to include a setting on the custom resource to produce a promehteus-compatible type on the /metrics page for these custom resource types.
Environment:
kubectl version
): v1.27.7+k3s2