I enabled kubelet/cAdvisor on my test GCP cluster (v1.29.6-gke1038001) with GCP managed Prometheus (0.12.0-gke.3). Once enabled, the following error repeats when viewing the pod logs with kubectl logs -f -ngmp-system -lapp.kubernetes.io/part-of=gmp:
{"level":"error","ts":"2024-08-01T22:05:05Z","msg":"poll and update","error":"invalid ClusterNodeMonitoring scrape pool format \"ClusterNodeMonitoring/gmp-kubelet-cadvisor/metrics/cadvisor\"","stacktrace":"github.com/GoogleCloudPlatform/prometheus-engine/pkg/operator.(*targetStatusReconciler).Reconcile\n\t/app/pkg/operator/target_status.go:176\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:119\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:316\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227"}
From the error message, invalid ClusterNodeMonitoring scrape pool format "ClusterNodeMonitoring/gmp-kubelet-cadvisor/metrics/cadvisor", the failure can be traced to pkg/operator/endpoint_status_builder.go, line 166:
    case "ClusterNodeMonitoring":
        if len(split) != 3 {
            return scrapePool{}, fmt.Errorf("invalid ClusterNodeMonitoring scrape pool format %q", pool)
        }
        return getClusterScopedScrapePool(pool, split), nil
The gmp-kubelet-cadvisor ClusterNodeMonitoring in https://github.com/GoogleCloudPlatform/prometheus-engine/blob/v0.12.0/examples/cadvisor-metrics.yaml will always fail this check, because its endpoint path (/metrics/cadvisor) contains two forward slashes, so the scrape pool name always splits into 4 parts instead of the expected 3. This doesn't happen with the similar https://github.com/GoogleCloudPlatform/prometheus-engine/blob/v0.12.0/examples/kubelet-metrics.yaml because its endpoint path has only a single forward slash (/metrics).
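The part counts can be reproduced with a minimal sketch. It assumes the operator splits the scrape pool name on "/" before the length check, which the snippet from endpoint_status_builder.go does not show. poolParts is a hypothetical helper, and the second pool name is a made-up example of an endpoint whose path is just /metrics.

```go
package main

import (
	"fmt"
	"strings"
)

// poolParts reports how many "/"-separated segments a scrape pool name has.
// Assumption: the operator splits the pool name on "/" before checking its length.
func poolParts(pool string) int {
	return len(strings.Split(pool, "/"))
}

func main() {
	// Pool name taken verbatim from the error message.
	cadvisor := "ClusterNodeMonitoring/gmp-kubelet-cadvisor/metrics/cadvisor"
	// Hypothetical pool name for an endpoint whose path is just /metrics.
	kubelet := "ClusterNodeMonitoring/example-kubelet/metrics"

	fmt.Println(poolParts(cadvisor)) // 4, so len(split) != 3 and the error is returned
	fmt.Println(poolParts(kubelet))  // 3, so the check passes
}
```

If that assumption holds, any ClusterNodeMonitoring endpoint whose path contains more than one forward slash would hit the same error.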