What happened:
As per the Kubernetes documentation, there are two ways to extend the Kubernetes API: custom resource definitions and the aggregation layer. The custom resource state metrics feature of kube-state-metrics can generate metrics for both kinds of custom resources. However, the bug appears when the Kubernetes API contains no custom resource definitions, only custom resources served through the aggregation layer. In that case, no custom resource metrics are generated.
We think the problem might be caused by the CRDiscoverer, because it polls only if a custom resource definition is discovered. A quick workaround is to initialize the CRDiscoverer with the WasUpdated property set to true, but this is probably not the right way to fix the bug:
diff --git a/pkg/app/server.go b/pkg/app/server.go
index 9d25fd0e..ff70aece 100644
--- a/pkg/app/server.go
+++ b/pkg/app/server.go
@@ -308,6 +308,7 @@ func RunKubeStateMetrics(ctx context.Context, opts *options.Options) error {
 			CRDsAddEventsCounter:    crdsAddEventsCounter,
 			CRDsDeleteEventsCounter: crdsDeleteEventsCounter,
 			CRDsCacheCountGauge:     crdsCacheCountGauge,
+			WasUpdated:              true,
 		}
 		// This starts a goroutine that will watch for any new GVKs to extract from CRDs.
 		err = discovererInstance.StartDiscovery(ctx, kubeConfig)
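To illustrate why this workaround has an effect, here is a toy model of a poll loop gated on an update flag. It is a sketch of our understanding, not the kube-state-metrics implementation: the `discoverer` type, its `onCRDEvent` and `poll` methods, and the `rebuilds` counter are hypothetical names. Only the `WasUpdated` field name is taken from the diff.

```go
package main

import "fmt"

// discoverer is a hypothetical, heavily simplified stand-in for a CRD discoverer.
// Only the WasUpdated field name comes from the workaround diff.
type discoverer struct {
	WasUpdated bool // set when a CRD add/delete event is observed
	rebuilds   int  // counts how often the custom resource stores were rebuilt
}

// onCRDEvent models a CRD add/delete event arriving from a watch.
func (d *discoverer) onCRDEvent() { d.WasUpdated = true }

// poll models one tick of the polling loop: stores are rebuilt only if
// something was flagged as updated since the previous tick.
func (d *discoverer) poll() {
	if !d.WasUpdated {
		return
	}
	d.rebuilds++
	d.WasUpdated = false
}

func main() {
	// Cluster without CRDs: no event ever sets the flag, so nothing is built,
	// even if the configuration targets aggregation-layer resources.
	noCRDs := &discoverer{}
	noCRDs.poll()
	fmt.Println("no CRDs, rebuilds:", noCRDs.rebuilds)

	// Workaround: start with the flag set, so the first tick builds the stores.
	workaround := &discoverer{WasUpdated: true}
	workaround.poll()
	fmt.Println("workaround, rebuilds:", workaround.rebuilds)
}
```

In this model, the only thing that can trigger a rebuild is a CRD event, which matches the behaviour we observe in the reproduction steps: metrics appear only after a dummy custom resource definition is applied.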
As far as we understand, watching the customresourcedefinitions resource in the apiextensions.k8s.io API group is only required to support wildcard matching. However, this does not seem to take custom resources from the aggregation layer into account: we tested wildcard matching for such resources and it did not work for us. It seems kube-state-metrics can only generate metrics for them when the group, version and kind are all provided in the custom resource state configuration. Note that watching the customresourcedefinitions resource might not be necessary in this case, because the configuration already contains all the information needed to call the API.
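The distinction between fully specified and wildcard entries could be expressed roughly as follows. This is only a sketch of the idea: the `gvk` type and the `needsDiscovery` function are hypothetical and do not exist in kube-state-metrics, and we assume `*` is the wildcard value used for version and kind.

```go
package main

import "fmt"

// gvk is a hypothetical representation of one configured resource entry.
type gvk struct {
	Group, Version, Kind string
}

// needsDiscovery reports whether an entry relies on wildcard matching and
// therefore cannot be resolved from the configuration alone.
// Assumption: "*" is the wildcard value for version and kind.
func needsDiscovery(r gvk) bool {
	return r.Version == "*" || r.Kind == "*"
}

func main() {
	entries := []gvk{
		{Group: "metrics.k8s.io", Version: "v1beta1", Kind: "NodeMetrics"},
		{Group: "stable.example.com", Version: "*", Kind: "*"},
	}
	for _, e := range entries {
		fmt.Printf("%s/%s %s needs discovery: %t\n", e.Group, e.Version, e.Kind, needsDiscovery(e))
	}
}
```

With such a check, entries that name group, version and kind explicitly would not have to wait for a custom resource definition event before their metrics are generated.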
We would like to ask whether kube-state-metrics could watch custom resources from the aggregation layer as well. We think this would not only be a better way to fix this bug, but would also make the custom resource state feature work the same way for any custom resource (aggregation layer or custom resource definitions). The /openapi/v2 endpoint could be discussed as an alternative way to discover which resources exist in the Kubernetes API:
kubectl get --raw /openapi/v2 | jq -r '.paths | keys[]'
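As a rough illustration of how the paths returned by that command could be turned into a list of API groups and versions, here is a minimal Go sketch. The `groupVersions` function and the sample document are hypothetical; a real implementation would read the /openapi/v2 response body from the API server, and we assume resource paths have the shape /apis/<group>/<version>/...

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// groupVersions extracts the distinct "<group>/<version>" prefixes found under
// /apis/ in an OpenAPI v2 document. It does not care whether a group is backed
// by a custom resource definition or by an aggregated API server.
func groupVersions(openAPIv2 []byte) ([]string, error) {
	var doc struct {
		Paths map[string]json.RawMessage `json:"paths"`
	}
	if err := json.Unmarshal(openAPIv2, &doc); err != nil {
		return nil, err
	}
	seen := map[string]struct{}{}
	for p := range doc.Paths {
		// Assumed shape: /apis/<group>/<version>/...
		parts := strings.Split(strings.TrimPrefix(p, "/"), "/")
		if len(parts) < 3 || parts[0] != "apis" || parts[2] == "" {
			continue
		}
		seen[parts[1]+"/"+parts[2]] = struct{}{}
	}
	out := make([]string, 0, len(seen))
	for gv := range seen {
		out = append(out, gv)
	}
	sort.Strings(out)
	return out, nil
}

func main() {
	// Hypothetical sample standing in for the /openapi/v2 response body.
	sample := []byte(`{"paths": {
		"/api/v1/pods": {},
		"/apis/metrics.k8s.io/v1beta1/nodes": {},
		"/apis/metrics.k8s.io/v1beta1/pods": {},
		"/apis/stable.example.com/v1/namespaces/{namespace}/shirts": {}
	}}`)
	gvs, err := groupVersions(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(gvs)
}
```

This only covers discovery of group and version. Mapping paths back to kinds, and watching for changes over time, would need more work and is part of what we would like to discuss.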
What you expected to happen:
We would expect metrics for custom resources from the aggregation layer to be exposed as defined in the custom resource state configuration, regardless of whether any custom resource definitions exist.
How to reproduce it (as minimally and precisely as possible):
# Execute kube-state-metrics against a Kubernetes cluster without custom resource definitions, and
# use a custom resource state configuration that generates metrics from custom resources from the aggregation layer.
# This should show no resources
kubectl get crd
go run main.go --custom-resource-state-only --custom-resource-state-config-file <custom-resource-config-file.yaml> --kubeconfig $KUBECONFIG
# Check kube-state-metrics and validate the configured custom resource metrics are not generated
curl http://localhost:8080/metrics
# Now apply a dummy custom resource definition, e.g., the example custom resource definition from the Kubernetes documentation
# (in the command below, deleting the selectableFields entry might not be required depending on the Kubernetes version)
curl -s https://raw.githubusercontent.com/kubernetes/website/main/content/en/examples/customresourcedefinition/shirt-resource-definition.yaml | yq 'del(.spec.versions[0].selectableFields)' | kubectl --kubeconfig $KUBECONFIG apply -f -
# Check kube-state-metrics again and validate the configured custom resource metrics appear after a few moments
curl http://localhost:8080/metrics
Anything else we need to know?:
N/A
Environment:
Kubernetes version (use kubectl version): 1.30.3