johnswarbrick-napier opened this issue 1 year ago
I tried explicitly setting the namespaces to discover using:
namespaces:
  releaseNamespace: true
  additional:
    - aml
    - ingress-nginx
    - kube-system
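For clarity, these values should be merged into the operator's `--namespaces` flag as the release namespace plus the `additional` entries. A minimal sketch of that expected merge (the helper is hypothetical, not the chart's actual template code):

```python
# Sketch of how the namespaces values are expected to render into the
# operator's --namespaces flag (merge logic assumed, not the chart's code).
def render_namespaces_flag(release_namespace, values):
    names = []
    if values.get("releaseNamespace"):
        names.append(release_namespace)
    names.extend(values.get("additional", []))
    return "--namespaces=" + ",".join(names)

values = {
    "releaseNamespace": True,
    "additional": ["aml", "ingress-nginx", "kube-system"],
}
print(render_namespaces_flag("aml-monitoring", values))
# --namespaces=aml-monitoring,aml,ingress-nginx,kube-system
```

This matches the flag that ended up in the deployment below, so the values themselves appear to be rendered correctly.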
This has been correctly set in the operator deployment YAML:
spec:
  containers:
    - args:
        - --kubelet-service=kube-system/aml-monitoring-kube-promet-kubelet
        - --log-level=debug
        - --namespaces=aml-monitoring,aml,ingress-nginx,kube-system
        - --localhost=127.0.0.1
        - --prometheus-config-reloader=napier.azurecr.io/prometheus-config-reloader:v0.66.0
        - --config-reloader-cpu-request=200m
        - --config-reloader-cpu-limit=200m
        - --config-reloader-memory-request=50Mi
        - --config-reloader-memory-limit=50Mi
        - --thanos-default-base-image=napier.azurecr.io/thanos/thanos:v0.31.0
        - --secret-field-selector=type!=kubernetes.io/dockercfg,type!=kubernetes.io/service-account-token,type!=helm.sh/release.v1
...but Prometheus is still only querying the Kubernetes API for the default and kube-system namespaces and the installation namespace; it is ignoring the explicitly set aml and ingress-nginx namespaces:
ts=2023-07-04T09:00:15.657Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169: Watch close - *v1.Endpoints total 8 items received"
ts=2023-07-04T09:00:15.659Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/kube-system/endpoints?allowWatchBookmarks=true&resourceVersion=106613346&timeout=6m50s&timeoutSeconds=410&watch=true 200 OK in 1 milliseconds"
ts=2023-07-04T09:01:04.654Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169: Watch close - *v1.Endpoints total 12 items received"
ts=2023-07-04T09:01:04.657Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/endpoints?allowWatchBookmarks=true&resourceVersion=106613675&timeout=9m2s&timeoutSeconds=542&watch=true 200 OK in 2 milliseconds"
ts=2023-07-04T09:01:55.663Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169: Watch close - *v1.Pod total 8 items received"
ts=2023-07-04T09:01:55.668Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/pods?allowWatchBookmarks=true&resourceVersion=106613994&timeout=8m47s&timeoutSeconds=527&watch=true 200 OK in 4 milliseconds"
ts=2023-07-04T09:02:13.660Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169: Watch close - *v1.Endpoints total 8 items received"
ts=2023-07-04T09:02:13.662Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/endpoints?allowWatchBookmarks=true&resourceVersion=106614105&timeout=8m55s&timeoutSeconds=535&watch=true 200 OK in 2 milliseconds"
ts=2023-07-04T09:04:26.662Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169: Watch close - *v1.Pod total 7 items received"
ts=2023-07-04T09:04:26.665Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/default/pods?allowWatchBookmarks=true&resourceVersion=106614950&timeout=6m10s&timeoutSeconds=370&watch=true 200 OK in 3 milliseconds"
ts=2023-07-04T09:04:33.657Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169: Watch close - *v1.Service total 10 items received"
ts=2023-07-04T09:04:33.660Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/default/services?allowWatchBookmarks=true&resourceVersion=106614994&timeout=7m1s&timeoutSeconds=421&watch=true 200 OK in 3 milliseconds"
ts=2023-07-04T09:04:50.663Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169: Watch close - *v1.Service total 7 items received"
ts=2023-07-04T09:04:50.666Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/services?allowWatchBookmarks=true&resourceVersion=106615093&timeout=9m40s&timeoutSeconds=580&watch=true 200 OK in 1 milliseconds"
ts=2023-07-04T09:06:13.663Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169: Watch close - *v1.Pod total 7 items received"
ts=2023-07-04T09:06:13.667Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/kube-system/pods?allowWatchBookmarks=true&resourceVersion=106615658&timeout=8m45s&timeoutSeconds=525&watch=true 200 OK in 4 milliseconds"
ts=2023-07-04T09:06:15.663Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169: Watch close - *v1.Endpoints total 8 items received"
ts=2023-07-04T09:06:15.665Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/default/endpoints?allowWatchBookmarks=true&resourceVersion=106615666&timeout=9m27s&timeoutSeconds=567&watch=true 200 OK in 1 milliseconds"
ts=2023-07-04T09:07:05.660Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169: Watch close - *v1.Endpoints total 8 items received"
ts=2023-07-04T09:07:05.663Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/kube-system/endpoints?allowWatchBookmarks=true&resourceVersion=106615983&timeout=5m53s&timeoutSeconds=353&watch=true 200 OK in 3 milliseconds"
ts=2023-07-04T09:07:33.661Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169: Watch close - *v1.Pod total 10 items received"
ts=2023-07-04T09:07:33.666Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/pods?allowWatchBookmarks=true&resourceVersion=106616160&timeout=7m8s&timeoutSeconds=428&watch=true 200 OK in 4 milliseconds"
ts=2023-07-04T09:07:45.659Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169: Watch close - *v1.Service total 9 items received"
ts=2023-07-04T09:07:45.660Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/kube-system/services?allowWatchBookmarks=true&resourceVersion=106616253&timeout=8m29s&timeoutSeconds=509&watch=true 200 OK in 1 milliseconds"
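To confirm which namespaces Prometheus is actually watching, the namespace path segment can be pulled out of these GET lines. A quick sketch (the regex is an assumption based on the log format shown above, and the sample lines are abbreviated):

```python
import re

# Abbreviated copies of the watch-request log lines above.
log_lines = [
    'msg="GET https://10.101.0.1:443/api/v1/namespaces/kube-system/endpoints?watch=true 200 OK in 1 milliseconds"',
    'msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/pods?watch=true 200 OK in 4 milliseconds"',
    'msg="GET https://10.101.0.1:443/api/v1/namespaces/default/services?watch=true 200 OK in 3 milliseconds"',
]

# Extract the namespace path segment from each API watch request.
watched = {m.group(1) for line in log_lines
           for m in re.finditer(r"/api/v1/namespaces/([^/]+)/", line)}
print(sorted(watched))   # ['aml-monitoring', 'default', 'kube-system']
print("aml" in watched)  # False: aml is never queried
```

Running this over the full debug log never yields aml or ingress-nginx, only the three namespaces listed above.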
I made a small amount of progress - the Prometheus Operator is now aware of the other namespaces - but Prometheus still isn't scraping from namespaces other than default, kube-system and [install_namespace].
The same chart versions work perfectly on Azure, with discovery happening across all namespaces, so this may be an AWS EKS-specific issue - but I've been working on it solidly for two days and still can't get it working!
Prometheus Operator logs showing all namespaces selected:
level=debug ts=2023-07-04T18:35:57.037842081Z caller=resource_selector.go:93 component=prometheusoperator msg="filtering namespaces to select ServiceMonitors from" namespaces=default,aml-monitoring,kube-public,tailscale,kube-node-lease,kube-system,aml,ingress-nginx,test-aml namespace=aml-monitoring prometheus=aml-monitoring-kube-promet
level=debug ts=2023-07-04T18:35:57.042876561Z caller=resource_selector.go:334 component=prometheusoperator msg="filtering namespaces to select PodMonitors from" namespaces=kube-node-lease,kube-system,default,aml-monitoring,kube-public,tailscale,ingress-nginx,test-aml,aml namespace=aml-monitoring prometheus=aml-monitoring-kube-promet
level=debug ts=2023-07-04T18:35:57.042927182Z caller=resource_selector.go:464 component=prometheusoperator msg="filtering namespaces to select Probes from" namespaces=kube-node-lease,kube-system,default,aml-monitoring,kube-public,tailscale,ingress-nginx,test-aml,aml namespace=aml-monitoring prometheus=aml-monitoring-kube-promet
level=debug ts=2023-07-04T18:35:57.042972863Z caller=resource_selector.go:611 component=prometheusoperator msg="filtering namespaces to select ScrapeConfigs from" namespaces=kube-node-lease,kube-system,default,aml-monitoring,kube-public,tailscale,ingress-nginx,test-aml,aml namespace=aml-monitoring prometheus=aml-monitoring-kube-promet
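Diffing the operator's selected namespaces against the ones Prometheus actually queries makes the gap explicit (both lists copied from the logs in this comment):

```python
# Namespaces the operator reports selecting (from the resource_selector logs)...
operator_selected = set(
    "default,aml-monitoring,kube-public,tailscale,kube-node-lease,"
    "kube-system,aml,ingress-nginx,test-aml".split(",")
)
# ...versus namespaces Prometheus issues watch requests for (from the API logs).
prometheus_watched = {"default", "kube-system", "aml-monitoring"}

missing = sorted(operator_selected - prometheus_watched)
print(missing)
# ['aml', 'ingress-nginx', 'kube-node-lease', 'kube-public', 'tailscale', 'test-aml']
```

So the operator sees all nine namespaces, yet the generated Prometheus config only ever drives service discovery against three of them.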
Prometheus logs showing that it's only making Kubernetes API queries for default, kube-system and [install_namespace] namespaces:
ts=2023-07-04T18:41:24.846Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/kube-system/pods?allowWatchBookmarks=true&resourceVersion=106838227&timeout=7m48s&timeoutSeconds=468&watch=true 200 OK in 5 milliseconds"
ts=2023-07-04T18:41:58.842Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/pods?allowWatchBookmarks=true&resourceVersion=106838480&timeout=5m50s&timeoutSeconds=350&watch=true 200 OK in 2 milliseconds"
ts=2023-07-04T18:42:12.841Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/default/endpoints?allowWatchBookmarks=true&resourceVersion=106838559&timeout=9m51s&timeoutSeconds=591&watch=true 200 OK in 3 milliseconds"
ts=2023-07-04T18:42:59.838Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/kube-system/endpoints?allowWatchBookmarks=true&resourceVersion=106838893&timeout=9m5s&timeoutSeconds=545&watch=true 200 OK in 4 milliseconds"
ts=2023-07-04T18:43:52.843Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/services?allowWatchBookmarks=true&resourceVersion=106839244&timeout=6m25s&timeoutSeconds=385&watch=true 200 OK in 1 milliseconds"
ts=2023-07-04T18:44:10.847Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/endpoints?allowWatchBookmarks=true&resourceVersion=106839367&timeout=8m3s&timeoutSeconds=483&watch=true 200 OK in 5 milliseconds"
ts=2023-07-04T18:45:29.839Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/services?allowWatchBookmarks=true&resourceVersion=106839851&timeout=7m39s&timeoutSeconds=459&watch=true 200 OK in 3 milliseconds"
ts=2023-07-04T18:46:19.839Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/default/pods?allowWatchBookmarks=true&resourceVersion=106840192&timeout=7m11s&timeoutSeconds=431&watch=true 200 OK in 1 milliseconds"
ts=2023-07-04T18:46:47.844Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/default/services?allowWatchBookmarks=true&resourceVersion=106840345&timeout=7m0s&timeoutSeconds=420&watch=true 200 OK in 1 milliseconds"
ts=2023-07-04T18:47:48.845Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/pods?allowWatchBookmarks=true&resourceVersion=106840758&timeout=9m33s&timeoutSeconds=573&watch=true 200 OK in 2 milliseconds"
ts=2023-07-04T18:47:51.852Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/endpoints?allowWatchBookmarks=true&resourceVersion=106840782&timeout=6m25s&timeoutSeconds=385&watch=true 200 OK in 2 milliseconds"
ts=2023-07-04T18:48:05.845Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/kube-system/services?allowWatchBookmarks=true&resourceVersion=106840847&timeout=8m57s&timeoutSeconds=537&watch=true 200 OK in 3 milliseconds"
ts=2023-07-04T18:48:15.840Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/pods?allowWatchBookmarks=true&resourceVersion=106840921&timeout=7m18s&timeoutSeconds=438&watch=true 200 OK in 1 milliseconds"
ts=2023-07-04T18:49:12.849Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/kube-system/pods?allowWatchBookmarks=true&resourceVersion=106841291&timeout=9m41s&timeoutSeconds=581&watch=true 200 OK in 2 milliseconds"
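To inspect the configuration Prometheus is actually running, the generated config can be pulled from the secret: the `prometheus.yaml.gz` key holds base64-encoded, gzipped YAML. A sketch of the decode step (the `kubectl` invocation in the comment is assumed; a synthetic payload stands in for the fetched value):

```python
import base64
import gzip

# In a cluster you would fetch the value first, e.g. (command assumed):
#   kubectl -n aml-monitoring get secret prometheus-aml-monitoring-kube-promet \
#     -o jsonpath='{.data.prometheus\.yaml\.gz}'
# Here a synthetic payload stands in for the fetched base64 string.
raw_yaml = b"global:\n  scrape_interval: 30s\n"
encoded = base64.b64encode(gzip.compress(raw_yaml))

# Reverse the encoding: base64-decode, then gunzip.
decoded = gzip.decompress(base64.b64decode(encoded)).decode()
print(decoded)
```

Decoding the real secret produced the configuration below.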
This is the contents of the prometheus.yaml.gz from the secret prometheus-aml-monitoring-kube-promet:
global:
evaluation_interval: 30s
scrape_interval: 30s
external_labels:
prometheus: aml-monitoring/aml-monitoring-kube-promet
prometheus_replica: $(POD_NAME)
rule_files:
- /etc/prometheus/rules/prometheus-aml-monitoring-kube-promet-rulefiles-0/*.yaml
scrape_configs:
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-alertmanager/0
honor_labels: false
kubernetes_sd_configs:
- role: endpoints
namespaces:
names:
- aml-monitoring
metrics_path: /metrics
enable_http2: true
relabel_configs:
- source_labels:
- job
target_label: __tmp_prometheus_job_name
- action: keep
source_labels:
- __meta_kubernetes_service_label_app
- __meta_kubernetes_service_labelpresent_app
regex: (kube-prometheus-stack-alertmanager);true
- action: keep
source_labels:
- __meta_kubernetes_service_label_release
- __meta_kubernetes_service_labelpresent_release
regex: (aml-monitoring);true
- action: keep
source_labels:
- __meta_kubernetes_service_label_self_monitor
- __meta_kubernetes_service_labelpresent_self_monitor
regex: (true);true
- action: keep
source_labels:
- __meta_kubernetes_endpoint_port_name
regex: http-web
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Node;(.*)
replacement: ${1}
target_label: node
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Pod;(.*)
replacement: ${1}
target_label: pod
- source_labels:
- __meta_kubernetes_namespace
target_label: namespace
- source_labels:
- __meta_kubernetes_service_name
target_label: service
- source_labels:
- __meta_kubernetes_pod_name
target_label: pod
- source_labels:
- __meta_kubernetes_pod_container_name
target_label: container
- action: drop
source_labels:
- __meta_kubernetes_pod_phase
regex: (Failed|Succeeded)
- source_labels:
- __meta_kubernetes_service_name
target_label: job
replacement: ${1}
- target_label: endpoint
replacement: http-web
- source_labels:
- __address__
target_label: __tmp_hash
modulus: 1
action: hashmod
- source_labels:
- __tmp_hash
regex: $(SHARD)
action: keep
metric_relabel_configs: []
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-apiserver/0
honor_labels: false
kubernetes_sd_configs:
- role: endpoints
namespaces:
names:
- default
scheme: https
tls_config:
insecure_skip_verify: false
server_name: kubernetes
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
relabel_configs:
- source_labels:
- job
target_label: __tmp_prometheus_job_name
- action: keep
source_labels:
- __meta_kubernetes_service_label_component
- __meta_kubernetes_service_labelpresent_component
regex: (apiserver);true
- action: keep
source_labels:
- __meta_kubernetes_service_label_provider
- __meta_kubernetes_service_labelpresent_provider
regex: (kubernetes);true
- action: keep
source_labels:
- __meta_kubernetes_endpoint_port_name
regex: https
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Node;(.*)
replacement: ${1}
target_label: node
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Pod;(.*)
replacement: ${1}
target_label: pod
- source_labels:
- __meta_kubernetes_namespace
target_label: namespace
- source_labels:
- __meta_kubernetes_service_name
target_label: service
- source_labels:
- __meta_kubernetes_pod_name
target_label: pod
- source_labels:
- __meta_kubernetes_pod_container_name
target_label: container
- action: drop
source_labels:
- __meta_kubernetes_pod_phase
regex: (Failed|Succeeded)
- source_labels:
- __meta_kubernetes_service_name
target_label: job
replacement: ${1}
- source_labels:
- __meta_kubernetes_service_label_component
target_label: job
regex: (.+)
replacement: ${1}
- target_label: endpoint
replacement: https
- source_labels:
- __address__
target_label: __tmp_hash
modulus: 1
action: hashmod
- source_labels:
- __tmp_hash
regex: $(SHARD)
action: keep
metric_relabel_configs:
- source_labels:
- __name__
- le
regex: apiserver_request_duration_seconds_bucket;(0.15|0.2|0.3|0.35|0.4|0.45|0.6|0.7|0.8|0.9|1.25|1.5|1.75|2|3|3.5|4|4.5|6|7|8|9|15|25|40|50)
action: drop
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-coredns/0
honor_labels: false
kubernetes_sd_configs:
- role: endpoints
namespaces:
names:
- kube-system
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
relabel_configs:
- source_labels:
- job
target_label: __tmp_prometheus_job_name
- action: keep
source_labels:
- __meta_kubernetes_service_label_app
- __meta_kubernetes_service_labelpresent_app
regex: (kube-prometheus-stack-coredns);true
- action: keep
source_labels:
- __meta_kubernetes_service_label_release
- __meta_kubernetes_service_labelpresent_release
regex: (aml-monitoring);true
- action: keep
source_labels:
- __meta_kubernetes_endpoint_port_name
regex: http-metrics
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Node;(.*)
replacement: ${1}
target_label: node
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Pod;(.*)
replacement: ${1}
target_label: pod
- source_labels:
- __meta_kubernetes_namespace
target_label: namespace
- source_labels:
- __meta_kubernetes_service_name
target_label: service
- source_labels:
- __meta_kubernetes_pod_name
target_label: pod
- source_labels:
- __meta_kubernetes_pod_container_name
target_label: container
- action: drop
source_labels:
- __meta_kubernetes_pod_phase
regex: (Failed|Succeeded)
- source_labels:
- __meta_kubernetes_service_name
target_label: job
replacement: ${1}
- source_labels:
- __meta_kubernetes_service_label_jobLabel
target_label: job
regex: (.+)
replacement: ${1}
- target_label: endpoint
replacement: http-metrics
- source_labels:
- __address__
target_label: __tmp_hash
modulus: 1
action: hashmod
- source_labels:
- __tmp_hash
regex: $(SHARD)
action: keep
metric_relabel_configs: []
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-kube-etcd/0
honor_labels: false
kubernetes_sd_configs:
- role: endpoints
namespaces:
names:
- kube-system
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
relabel_configs:
- source_labels:
- job
target_label: __tmp_prometheus_job_name
- action: keep
source_labels:
- __meta_kubernetes_service_label_app
- __meta_kubernetes_service_labelpresent_app
regex: (kube-prometheus-stack-kube-etcd);true
- action: keep
source_labels:
- __meta_kubernetes_service_label_release
- __meta_kubernetes_service_labelpresent_release
regex: (aml-monitoring);true
- action: keep
source_labels:
- __meta_kubernetes_endpoint_port_name
regex: http-metrics
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Node;(.*)
replacement: ${1}
target_label: node
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Pod;(.*)
replacement: ${1}
target_label: pod
- source_labels:
- __meta_kubernetes_namespace
target_label: namespace
- source_labels:
- __meta_kubernetes_service_name
target_label: service
- source_labels:
- __meta_kubernetes_pod_name
target_label: pod
- source_labels:
- __meta_kubernetes_pod_container_name
target_label: container
- action: drop
source_labels:
- __meta_kubernetes_pod_phase
regex: (Failed|Succeeded)
- source_labels:
- __meta_kubernetes_service_name
target_label: job
replacement: ${1}
- source_labels:
- __meta_kubernetes_service_label_jobLabel
target_label: job
regex: (.+)
replacement: ${1}
- target_label: endpoint
replacement: http-metrics
- source_labels:
- __address__
target_label: __tmp_hash
modulus: 1
action: hashmod
- source_labels:
- __tmp_hash
regex: $(SHARD)
action: keep
metric_relabel_configs: []
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-kubelet/0
honor_labels: true
kubernetes_sd_configs:
- role: endpoints
namespaces:
names:
- kube-system
scheme: https
tls_config:
insecure_skip_verify: true
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
relabel_configs:
- source_labels:
- job
target_label: __tmp_prometheus_job_name
- action: keep
source_labels:
- __meta_kubernetes_service_label_app_kubernetes_io_name
- __meta_kubernetes_service_labelpresent_app_kubernetes_io_name
regex: (kubelet);true
- action: keep
source_labels:
- __meta_kubernetes_service_label_k8s_app
- __meta_kubernetes_service_labelpresent_k8s_app
regex: (kubelet);true
- action: keep
source_labels:
- __meta_kubernetes_endpoint_port_name
regex: https-metrics
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Node;(.*)
replacement: ${1}
target_label: node
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Pod;(.*)
replacement: ${1}
target_label: pod
- source_labels:
- __meta_kubernetes_namespace
target_label: namespace
- source_labels:
- __meta_kubernetes_service_name
target_label: service
- source_labels:
- __meta_kubernetes_pod_name
target_label: pod
- source_labels:
- __meta_kubernetes_pod_container_name
target_label: container
- action: drop
source_labels:
- __meta_kubernetes_pod_phase
regex: (Failed|Succeeded)
- source_labels:
- __meta_kubernetes_service_name
target_label: job
replacement: ${1}
- source_labels:
- __meta_kubernetes_service_label_k8s_app
target_label: job
regex: (.+)
replacement: ${1}
- target_label: endpoint
replacement: https-metrics
- source_labels:
- __metrics_path__
target_label: metrics_path
action: replace
- source_labels:
- __address__
target_label: __tmp_hash
modulus: 1
action: hashmod
- source_labels:
- __tmp_hash
regex: $(SHARD)
action: keep
metric_relabel_configs: []
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-kubelet/1
honor_labels: true
kubernetes_sd_configs:
- role: endpoints
namespaces:
names:
- kube-system
metrics_path: /metrics/cadvisor
scheme: https
tls_config:
insecure_skip_verify: true
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
relabel_configs:
- source_labels:
- job
target_label: __tmp_prometheus_job_name
- action: keep
source_labels:
- __meta_kubernetes_service_label_app_kubernetes_io_name
- __meta_kubernetes_service_labelpresent_app_kubernetes_io_name
regex: (kubelet);true
- action: keep
source_labels:
- __meta_kubernetes_service_label_k8s_app
- __meta_kubernetes_service_labelpresent_k8s_app
regex: (kubelet);true
- action: keep
source_labels:
- __meta_kubernetes_endpoint_port_name
regex: https-metrics
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Node;(.*)
replacement: ${1}
target_label: node
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Pod;(.*)
replacement: ${1}
target_label: pod
- source_labels:
- __meta_kubernetes_namespace
target_label: namespace
- source_labels:
- __meta_kubernetes_service_name
target_label: service
- source_labels:
- __meta_kubernetes_pod_name
target_label: pod
- source_labels:
- __meta_kubernetes_pod_container_name
target_label: container
- action: drop
source_labels:
- __meta_kubernetes_pod_phase
regex: (Failed|Succeeded)
- source_labels:
- __meta_kubernetes_service_name
target_label: job
replacement: ${1}
- source_labels:
- __meta_kubernetes_service_label_k8s_app
target_label: job
regex: (.+)
replacement: ${1}
- target_label: endpoint
replacement: https-metrics
- source_labels:
- __metrics_path__
target_label: metrics_path
action: replace
- source_labels:
- __address__
target_label: __tmp_hash
modulus: 1
action: hashmod
- source_labels:
- __tmp_hash
regex: $(SHARD)
action: keep
metric_relabel_configs:
- source_labels:
- __name__
regex: container_cpu_(cfs_throttled_seconds_total|load_average_10s|system_seconds_total|user_seconds_total)
action: drop
- source_labels:
- __name__
regex: container_fs_(io_current|io_time_seconds_total|io_time_weighted_seconds_total|reads_merged_total|sector_reads_total|sector_writes_total|writes_merged_total)
action: drop
- source_labels:
- __name__
regex: container_memory_(mapped_file|swap)
action: drop
- source_labels:
- __name__
regex: container_(file_descriptors|tasks_state|threads_max)
action: drop
- source_labels:
- __name__
regex: container_spec.*
action: drop
- source_labels:
- id
- pod
regex: .+;
action: drop
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-kubelet/2
honor_labels: true
kubernetes_sd_configs:
- role: endpoints
namespaces:
names:
- kube-system
metrics_path: /metrics/probes
scheme: https
tls_config:
insecure_skip_verify: true
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
relabel_configs:
- source_labels:
- job
target_label: __tmp_prometheus_job_name
- action: keep
source_labels:
- __meta_kubernetes_service_label_app_kubernetes_io_name
- __meta_kubernetes_service_labelpresent_app_kubernetes_io_name
regex: (kubelet);true
- action: keep
source_labels:
- __meta_kubernetes_service_label_k8s_app
- __meta_kubernetes_service_labelpresent_k8s_app
regex: (kubelet);true
- action: keep
source_labels:
- __meta_kubernetes_endpoint_port_name
regex: https-metrics
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Node;(.*)
replacement: ${1}
target_label: node
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Pod;(.*)
replacement: ${1}
target_label: pod
- source_labels:
- __meta_kubernetes_namespace
target_label: namespace
- source_labels:
- __meta_kubernetes_service_name
target_label: service
- source_labels:
- __meta_kubernetes_pod_name
target_label: pod
- source_labels:
- __meta_kubernetes_pod_container_name
target_label: container
- action: drop
source_labels:
- __meta_kubernetes_pod_phase
regex: (Failed|Succeeded)
- source_labels:
- __meta_kubernetes_service_name
target_label: job
replacement: ${1}
- source_labels:
- __meta_kubernetes_service_label_k8s_app
target_label: job
regex: (.+)
replacement: ${1}
- target_label: endpoint
replacement: https-metrics
- source_labels:
- __metrics_path__
target_label: metrics_path
action: replace
- source_labels:
- __address__
target_label: __tmp_hash
modulus: 1
action: hashmod
- source_labels:
- __tmp_hash
regex: $(SHARD)
action: keep
metric_relabel_configs: []
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-operator/0
honor_labels: true
kubernetes_sd_configs:
- role: endpoints
namespaces:
names:
- aml-monitoring
scheme: https
tls_config:
insecure_skip_verify: false
ca_file: /etc/prometheus/certs/secret_aml-monitoring_aml-monitoring-kube-promet-admission_ca
server_name: aml-monitoring-kube-promet-operator
relabel_configs:
- source_labels:
- job
target_label: __tmp_prometheus_job_name
- action: keep
source_labels:
- __meta_kubernetes_service_label_app
- __meta_kubernetes_service_labelpresent_app
regex: (kube-prometheus-stack-operator);true
- action: keep
source_labels:
- __meta_kubernetes_service_label_release
- __meta_kubernetes_service_labelpresent_release
regex: (aml-monitoring);true
- action: keep
source_labels:
- __meta_kubernetes_endpoint_port_name
regex: https
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Node;(.*)
replacement: ${1}
target_label: node
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Pod;(.*)
replacement: ${1}
target_label: pod
- source_labels:
- __meta_kubernetes_namespace
target_label: namespace
- source_labels:
- __meta_kubernetes_service_name
target_label: service
- source_labels:
- __meta_kubernetes_pod_name
target_label: pod
- source_labels:
- __meta_kubernetes_pod_container_name
target_label: container
- action: drop
source_labels:
- __meta_kubernetes_pod_phase
regex: (Failed|Succeeded)
- source_labels:
- __meta_kubernetes_service_name
target_label: job
replacement: ${1}
- target_label: endpoint
replacement: https
- source_labels:
- __address__
target_label: __tmp_hash
modulus: 1
action: hashmod
- source_labels:
- __tmp_hash
regex: $(SHARD)
action: keep
metric_relabel_configs: []
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-prometheus/0
honor_labels: false
kubernetes_sd_configs:
- role: endpoints
namespaces:
names:
- aml-monitoring
metrics_path: /metrics
relabel_configs:
- source_labels:
- job
target_label: __tmp_prometheus_job_name
- action: keep
source_labels:
- __meta_kubernetes_service_label_app
- __meta_kubernetes_service_labelpresent_app
regex: (kube-prometheus-stack-prometheus);true
- action: keep
source_labels:
- __meta_kubernetes_service_label_release
- __meta_kubernetes_service_labelpresent_release
regex: (aml-monitoring);true
- action: keep
source_labels:
- __meta_kubernetes_service_label_self_monitor
- __meta_kubernetes_service_labelpresent_self_monitor
regex: (true);true
- action: keep
source_labels:
- __meta_kubernetes_endpoint_port_name
regex: http-web
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Node;(.*)
replacement: ${1}
target_label: node
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Pod;(.*)
replacement: ${1}
target_label: pod
- source_labels:
- __meta_kubernetes_namespace
target_label: namespace
- source_labels:
- __meta_kubernetes_service_name
target_label: service
- source_labels:
- __meta_kubernetes_pod_name
target_label: pod
- source_labels:
- __meta_kubernetes_pod_container_name
target_label: container
- action: drop
source_labels:
- __meta_kubernetes_pod_phase
regex: (Failed|Succeeded)
- source_labels:
- __meta_kubernetes_service_name
target_label: job
replacement: ${1}
- target_label: endpoint
replacement: http-web
- source_labels:
- __address__
target_label: __tmp_hash
modulus: 1
action: hashmod
- source_labels:
- __tmp_hash
regex: $(SHARD)
action: keep
metric_relabel_configs: []
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-state-metrics/0
honor_labels: true
kubernetes_sd_configs:
- role: endpoints
namespaces:
names:
- aml-monitoring
relabel_configs:
- source_labels:
- job
target_label: __tmp_prometheus_job_name
- action: keep
source_labels:
- __meta_kubernetes_service_label_app_kubernetes_io_instance
- __meta_kubernetes_service_labelpresent_app_kubernetes_io_instance
regex: (aml-monitoring);true
- action: keep
source_labels:
- __meta_kubernetes_service_label_app_kubernetes_io_name
- __meta_kubernetes_service_labelpresent_app_kubernetes_io_name
regex: (kube-state-metrics);true
- action: keep
source_labels:
- __meta_kubernetes_endpoint_port_name
regex: http
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Node;(.*)
replacement: ${1}
target_label: node
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Pod;(.*)
replacement: ${1}
target_label: pod
- source_labels:
- __meta_kubernetes_namespace
target_label: namespace
- source_labels:
- __meta_kubernetes_service_name
target_label: service
- source_labels:
- __meta_kubernetes_pod_name
target_label: pod
- source_labels:
- __meta_kubernetes_pod_container_name
target_label: container
- action: drop
source_labels:
- __meta_kubernetes_pod_phase
regex: (Failed|Succeeded)
- source_labels:
- __meta_kubernetes_service_name
target_label: job
replacement: ${1}
- source_labels:
- __meta_kubernetes_service_label_app_kubernetes_io_name
target_label: job
regex: (.+)
replacement: ${1}
- target_label: endpoint
replacement: http
- source_labels:
- __address__
target_label: __tmp_hash
modulus: 1
action: hashmod
- source_labels:
- __tmp_hash
regex: $(SHARD)
action: keep
metric_relabel_configs: []
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-loki/0
honor_labels: false
kubernetes_sd_configs:
- role: endpoints
namespaces:
names:
- aml-monitoring
scrape_interval: 15s
metrics_path: /metrics
scheme: http
relabel_configs:
- source_labels:
- job
target_label: __tmp_prometheus_job_name
- action: keep
source_labels:
- __meta_kubernetes_service_label_app_kubernetes_io_instance
- __meta_kubernetes_service_labelpresent_app_kubernetes_io_instance
regex: (aml-monitoring);true
- action: keep
source_labels:
- __meta_kubernetes_service_label_app_kubernetes_io_name
- __meta_kubernetes_service_labelpresent_app_kubernetes_io_name
regex: (loki);true
- action: drop
source_labels:
- __meta_kubernetes_service_label_prometheus_io_service_monitor
- __meta_kubernetes_service_labelpresent_prometheus_io_service_monitor
regex: (false);true
- action: keep
source_labels:
- __meta_kubernetes_endpoint_port_name
regex: http-metrics
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Node;(.*)
replacement: ${1}
target_label: node
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Pod;(.*)
replacement: ${1}
target_label: pod
- source_labels:
- __meta_kubernetes_namespace
target_label: namespace
- source_labels:
- __meta_kubernetes_service_name
target_label: service
- source_labels:
- __meta_kubernetes_pod_name
target_label: pod
- source_labels:
- __meta_kubernetes_pod_container_name
target_label: container
- action: drop
source_labels:
- __meta_kubernetes_pod_phase
regex: (Failed|Succeeded)
- source_labels:
- __meta_kubernetes_service_name
target_label: job
replacement: ${1}
- target_label: endpoint
replacement: http-metrics
- source_labels:
- job
target_label: job
replacement: aml-monitoring/$1
action: replace
- target_label: cluster
replacement: aml-monitoring-loki
action: replace
- source_labels:
- __address__
target_label: __tmp_hash
modulus: 1
action: hashmod
- source_labels:
- __tmp_hash
regex: $(SHARD)
action: keep
metric_relabel_configs: []
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-prometheus-node-exporter/0
honor_labels: false
kubernetes_sd_configs:
- role: endpoints
namespaces:
names:
- aml-monitoring
attach_metadata:
node: false
scheme: http
relabel_configs:
- source_labels:
- job
target_label: __tmp_prometheus_job_name
- action: keep
source_labels:
- __meta_kubernetes_service_label_app_kubernetes_io_instance
- __meta_kubernetes_service_labelpresent_app_kubernetes_io_instance
regex: (aml-monitoring);true
- action: keep
source_labels:
- __meta_kubernetes_service_label_app_kubernetes_io_name
- __meta_kubernetes_service_labelpresent_app_kubernetes_io_name
regex: (prometheus-node-exporter);true
- action: keep
source_labels:
- __meta_kubernetes_endpoint_port_name
regex: http-metrics
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Node;(.*)
replacement: ${1}
target_label: node
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Pod;(.*)
replacement: ${1}
target_label: pod
- source_labels:
- __meta_kubernetes_namespace
target_label: namespace
- source_labels:
- __meta_kubernetes_service_name
target_label: service
- source_labels:
- __meta_kubernetes_pod_name
target_label: pod
- source_labels:
- __meta_kubernetes_pod_container_name
target_label: container
- action: drop
source_labels:
- __meta_kubernetes_pod_phase
regex: (Failed|Succeeded)
- source_labels:
- __meta_kubernetes_service_name
target_label: job
replacement: ${1}
- source_labels:
- __meta_kubernetes_service_label_jobLabel
target_label: job
regex: (.+)
replacement: ${1}
- target_label: endpoint
replacement: http-metrics
- source_labels:
- __address__
target_label: __tmp_hash
modulus: 1
action: hashmod
- source_labels:
- __tmp_hash
regex: $(SHARD)
action: keep
metric_relabel_configs: []
storage:
tsdb:
out_of_order_time_window: 0s
alerting:
alert_relabel_configs:
- action: labeldrop
regex: prometheus_replica
alertmanagers:
- path_prefix: /
scheme: http
kubernetes_sd_configs:
- role: endpoints
namespaces:
names:
- aml-monitoring
api_version: v2
relabel_configs:
- action: keep
source_labels:
- __meta_kubernetes_service_name
regex: aml-monitoring-kube-promet-alertmanager
- action: keep
source_labels:
- __meta_kubernetes_endpoint_port_name
regex: http-web
I think the problem is that namespace filtering is broken.
kube-prometheus-stack is installed into namespace aml-monitoring.
The following config options are set, where I'm explicitly forcing scraping of the namespaces aml and tailscale:
prometheusOperator:
  namespaces:
    releaseNamespace: true
    additional:
      - aml
      - tailscale
prometheus:
  prometheusSpec:
    podMonitorSelectorNilUsesHelmValues: false
    serviceMonitorSelectorNilUsesHelmValues: false
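For what it's worth, my understanding (an assumption worth verifying) is that prometheusOperator.namespaces only restricts which namespaces the operator process watches, while monitor selection is governed by the selectors on the generated Prometheus resource. With the two *NilUsesHelmValues flags set to false, I believe the chart renders roughly this:

```yaml
# Sketch of the relevant fields on the rendered Prometheus resource
# (assumed rendering - verify with: kubectl -n aml-monitoring get prometheus -o yaml)
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
spec:
  serviceMonitorSelector: {}   # empty selector: select all ServiceMonitors
  podMonitorSelector: {}       # empty selector: select all PodMonitors
  # serviceMonitorNamespaceSelector / podMonitorNamespaceSelector are left
  # nil here, which prometheus-operator treats as "own namespace only"
```

If the namespace selectors really do come out nil, that would explain why only the installation namespace is ever selected.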
The Prometheus Operator logs indicate that it sees all namespaces during filtering, including the target namespaces aml and tailscale:
level=debug ts=2023-07-04T22:14:54.695699749Z caller=resource_selector.go:93 component=prometheusoperator msg="filtering namespaces to select ServiceMonitors from" namespaces=kube-public,kube-system,default,ingress-nginx,test-aml,aml,aml-monitoring,kube-node-lease,tailscale namespace=aml-monitoring prometheus=aml-monitoring-kube-promet
level=debug ts=2023-07-04T22:14:54.702177763Z caller=resource_selector.go:334 component=prometheusoperator msg="filtering namespaces to select PodMonitors from" namespaces=tailscale,aml,aml-monitoring,kube-node-lease,test-aml,kube-public,kube-system,default,ingress-nginx namespace=aml-monitoring prometheus=aml-monitoring-kube-promet
...but it then only ever selects monitors from the installation namespace, aml-monitoring - it completely ignores the target namespaces:
level=debug ts=2023-07-04T22:14:54.702152062Z caller=resource_selector.go:191 component=prometheusoperator msg="selected ServiceMonitors" servicemonitors=aml-monitoring/aml-monitoring-kube-promet-prometheus,aml-monitoring/aml-monitoring-kube-state-metrics,aml-monitoring/aml-monitoring-loki,aml-monitoring/aml-monitoring-kube-promet-operator,aml-monitoring/aml-monitoring-kube-promet-coredns,aml-monitoring/aml-monitoring-kube-promet-kubelet,aml-monitoring/aml-monitoring-kube-promet-kube-etcd,aml-monitoring/aml-monitoring-kube-promet-apiserver,aml-monitoring/aml-monitoring-prometheus-node-exporter,aml-monitoring/aml-monitoring-kube-promet-alertmanager namespace=aml-monitoring prometheus=aml-monitoring-kube-promet
level=debug ts=2023-07-04T22:14:54.702200793Z caller=resource_selector.go:424 component=prometheusoperator msg="selected PodMonitors" podmonitors= namespace=aml-monitoring prometheus=aml-monitoring-kube-promet
This means the Prometheus Operator never adds podMonitors and serviceMonitors from those additional namespaces to the prometheus.yaml.gz in the Prometheus secret, so their targets are never scraped by Prometheus.
The same config appears to work fine in Azure. Maybe this is an EKS permissions issue, with a KubeAPI query silently failing and never being logged? I cannot find any access-denied or other errors anywhere in the cluster, though.
@cccsss01 - I think you experienced the same in https://github.com/prometheus-community/helm-charts/issues/3410
@sebastianlutter - https://github.com/prometheus-community/helm-charts/issues/3487 seems exactly the same - how did you fix this?
Looks like https://github.com/prometheus-community/helm-charts/issues/2323 is similar.
@arpitjindal97 - what am I missing here??
I modified my configuration to:
kube-prometheus-stack:
  prometheusOperator:
    namespaces:
      releaseNamespace: true
      additional:
        - ingress-nginx
        - aml
        - kube-system
  prometheus:
    prometheusSpec:
      podMonitorSelectorNilUsesHelmValues: false
      podMonitorNamespaceSelector:
        matchExpressions:
          - key: kubernetes.io/metadata.name
            operator: Exists
      serviceMonitorSelectorNilUsesHelmValues: false
      serviceMonitorSelector: {}
      serviceMonitorNamespaceSelector:
        matchExpressions:
          - key: kubernetes.io/metadata.name
            operator: Exists
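Under that match-everything namespace selector, any PodMonitor in the aml namespace should be selected - for example this hypothetical one (all names here are made up for illustration):

```yaml
# Hypothetical PodMonitor in one of the additional namespaces; with the
# match-all podMonitorSelector and the metadata.name-Exists namespace
# selector above, the operator should pick it up.
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: aml-app          # assumed name
  namespace: aml
spec:
  selector:
    matchLabels:
      app: aml-app       # assumed pod label
  podMetricsEndpoints:
    - port: metrics      # assumed container port name
      path: /metrics
```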
...to match the output of kubectl get ns/aml -o json:
{
    "apiVersion": "v1",
    "kind": "Namespace",
    "metadata": {
        "creationTimestamp": "2023-02-01T11:48:14Z",
        "labels": {
            "kubernetes.io/metadata.name": "aml"
        },
        "name": "aml",
        "resourceVersion": "16264136",
        "uid": "6173158f-3202-4fc4-bebc-d7b2ea4ba36a"
    },
    "spec": {
        "finalizers": [
            "kubernetes"
        ]
    },
    "status": {
        "phase": "Active"
    }
}
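A quick way to confirm what actually got rendered is to decode the prometheus.yaml.gz from the generated secret and grep for the discovered namespaces (the secret name below follows the operator's prometheus-&lt;name&gt; convention and is an assumption for my cluster):

```shell
# Decode the generated scrape config and list the namespaces Prometheus
# will query the API for; adjust the secret name to your release.
kubectl -n aml-monitoring get secret prometheus-aml-monitoring-kube-promet-prometheus \
  -o jsonpath='{.data.prometheus\.yaml\.gz}' \
  | base64 -d | gunzip \
  | grep -B1 -A3 'namespaces:'
```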
But it's still not picking up any podMonitors from namespaces outside the installation namespace:
level=debug ts=2023-07-05T03:22:22.492509284Z caller=resource_selector.go:93 component=prometheusoperator msg="filtering namespaces to select ServiceMonitors from" namespaces=aml,kube-system,aml-monitoring,ingress-nginx namespace=aml-monitoring prometheus=aml-monitoring-kube-promet-prometheus
level=debug ts=2023-07-05T03:22:22.496242835Z caller=klog.go:84 component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/secrets/aml-monitoring-kube-promet-admission 200 OK in 3 milliseconds"
level=debug ts=2023-07-05T03:22:22.497143299Z caller=resource_selector.go:191 component=prometheusoperator msg="selected ServiceMonitors" servicemonitors=aml-monitoring/aml-monitoring-kube-promet-kube-etcd,aml-monitoring/aml-monitoring-kube-promet-coredns,aml-monitoring/aml-monitoring-kube-promet-operator,aml-monitoring/aml-monitoring-kube-state-metrics,aml-monitoring/aml-monitoring-kube-promet-kubelet,aml-monitoring/aml-monitoring-loki,aml-monitoring/aml-monitoring-kube-promet-alertmanager,aml-monitoring/aml-monitoring-kube-promet-prometheus,aml-monitoring/aml-monitoring-prometheus-node-exporter,aml-monitoring/aml-monitoring-kube-promet-apiserver namespace=aml-monitoring prometheus=aml-monitoring-kube-promet-prometheus
level=debug ts=2023-07-05T03:22:22.497199411Z caller=resource_selector.go:334 component=prometheusoperator msg="filtering namespaces to select PodMonitors from" namespaces=kube-system,aml-monitoring,ingress-nginx,aml namespace=aml-monitoring prometheus=aml-monitoring-kube-promet-prometheus
level=debug ts=2023-07-05T03:22:22.497229962Z caller=resource_selector.go:424 component=prometheusoperator msg="selected PodMonitors" podmonitors= namespace=aml-monitoring prometheus=aml-monitoring-kube-promet-prometheus
The Prometheus config is unchanged and only has the installation namespace:
global:
evaluation_interval: 30s
scrape_interval: 30s
external_labels:
prometheus: aml-monitoring/aml-monitoring-kube-promet-prometheus
prometheus_replica: $(POD_NAME)
rule_files:
- /etc/prometheus/rules/prometheus-aml-monitoring-kube-promet-prometheus-rulefiles-0/*.yaml
scrape_configs:
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-alertmanager/0
honor_labels: false
kubernetes_sd_configs:
- role: endpoints
namespaces:
names:
- aml-monitoring
metrics_path: /metrics
enable_http2: true
relabel_configs:
- source_labels:
- job
target_label: __tmp_prometheus_job_name
- action: keep
source_labels:
- __meta_kubernetes_service_label_app
- __meta_kubernetes_service_labelpresent_app
regex: (kube-prometheus-stack-alertmanager);true
- action: keep
source_labels:
- __meta_kubernetes_service_label_release
- __meta_kubernetes_service_labelpresent_release
regex: (aml-monitoring);true
- action: keep
source_labels:
- __meta_kubernetes_service_label_self_monitor
- __meta_kubernetes_service_labelpresent_self_monitor
regex: (true);true
- action: keep
source_labels:
- __meta_kubernetes_endpoint_port_name
regex: http-web
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Node;(.*)
replacement: ${1}
target_label: node
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Pod;(.*)
replacement: ${1}
target_label: pod
- source_labels:
- __meta_kubernetes_namespace
target_label: namespace
- source_labels:
- __meta_kubernetes_service_name
target_label: service
- source_labels:
- __meta_kubernetes_pod_name
target_label: pod
- source_labels:
- __meta_kubernetes_pod_container_name
target_label: container
- action: drop
source_labels:
- __meta_kubernetes_pod_phase
regex: (Failed|Succeeded)
- source_labels:
- __meta_kubernetes_service_name
target_label: job
replacement: ${1}
- target_label: endpoint
replacement: http-web
- source_labels:
- __address__
target_label: __tmp_hash
modulus: 1
action: hashmod
- source_labels:
- __tmp_hash
regex: $(SHARD)
action: keep
metric_relabel_configs: []
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-alertmanager/1
honor_labels: false
kubernetes_sd_configs:
- role: endpoints
namespaces:
names:
- aml-monitoring
metrics_path: /metrics
relabel_configs:
- source_labels:
- job
target_label: __tmp_prometheus_job_name
- action: keep
source_labels:
- __meta_kubernetes_service_label_app
- __meta_kubernetes_service_labelpresent_app
regex: (kube-prometheus-stack-alertmanager);true
- action: keep
source_labels:
- __meta_kubernetes_service_label_release
- __meta_kubernetes_service_labelpresent_release
regex: (aml-monitoring);true
- action: keep
source_labels:
- __meta_kubernetes_service_label_self_monitor
- __meta_kubernetes_service_labelpresent_self_monitor
regex: (true);true
- action: keep
source_labels:
- __meta_kubernetes_endpoint_port_name
regex: reloader-web
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Node;(.*)
replacement: ${1}
target_label: node
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Pod;(.*)
replacement: ${1}
target_label: pod
- source_labels:
- __meta_kubernetes_namespace
target_label: namespace
- source_labels:
- __meta_kubernetes_service_name
target_label: service
- source_labels:
- __meta_kubernetes_pod_name
target_label: pod
- source_labels:
- __meta_kubernetes_pod_container_name
target_label: container
- action: drop
source_labels:
- __meta_kubernetes_pod_phase
regex: (Failed|Succeeded)
- source_labels:
- __meta_kubernetes_service_name
target_label: job
replacement: ${1}
- target_label: endpoint
replacement: reloader-web
- source_labels:
- __address__
target_label: __tmp_hash
modulus: 1
action: hashmod
- source_labels:
- __tmp_hash
regex: $(SHARD)
action: keep
metric_relabel_configs: []
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-apiserver/0
honor_labels: false
kubernetes_sd_configs:
- role: endpoints
namespaces:
names:
- default
scheme: https
tls_config:
insecure_skip_verify: false
server_name: kubernetes
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
relabel_configs:
- source_labels:
- job
target_label: __tmp_prometheus_job_name
- action: keep
source_labels:
- __meta_kubernetes_service_label_component
- __meta_kubernetes_service_labelpresent_component
regex: (apiserver);true
- action: keep
source_labels:
- __meta_kubernetes_service_label_provider
- __meta_kubernetes_service_labelpresent_provider
regex: (kubernetes);true
- action: keep
source_labels:
- __meta_kubernetes_endpoint_port_name
regex: https
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Node;(.*)
replacement: ${1}
target_label: node
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Pod;(.*)
replacement: ${1}
target_label: pod
- source_labels:
- __meta_kubernetes_namespace
target_label: namespace
- source_labels:
- __meta_kubernetes_service_name
target_label: service
- source_labels:
- __meta_kubernetes_pod_name
target_label: pod
- source_labels:
- __meta_kubernetes_pod_container_name
target_label: container
- action: drop
source_labels:
- __meta_kubernetes_pod_phase
regex: (Failed|Succeeded)
- source_labels:
- __meta_kubernetes_service_name
target_label: job
replacement: ${1}
- source_labels:
- __meta_kubernetes_service_label_component
target_label: job
regex: (.+)
replacement: ${1}
- target_label: endpoint
replacement: https
- source_labels:
- __address__
target_label: __tmp_hash
modulus: 1
action: hashmod
- source_labels:
- __tmp_hash
regex: $(SHARD)
action: keep
metric_relabel_configs:
- source_labels:
- __name__
- le
regex: apiserver_request_duration_seconds_bucket;(0.15|0.2|0.3|0.35|0.4|0.45|0.6|0.7|0.8|0.9|1.25|1.5|1.75|2|3|3.5|4|4.5|6|7|8|9|15|25|40|50)
action: drop
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-coredns/0
honor_labels: false
kubernetes_sd_configs:
- role: endpoints
namespaces:
names:
- kube-system
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
relabel_configs:
- source_labels:
- job
target_label: __tmp_prometheus_job_name
- action: keep
source_labels:
- __meta_kubernetes_service_label_app
- __meta_kubernetes_service_labelpresent_app
regex: (kube-prometheus-stack-coredns);true
- action: keep
source_labels:
- __meta_kubernetes_service_label_release
- __meta_kubernetes_service_labelpresent_release
regex: (aml-monitoring);true
- action: keep
source_labels:
- __meta_kubernetes_endpoint_port_name
regex: http-metrics
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Node;(.*)
replacement: ${1}
target_label: node
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Pod;(.*)
replacement: ${1}
target_label: pod
- source_labels:
- __meta_kubernetes_namespace
target_label: namespace
- source_labels:
- __meta_kubernetes_service_name
target_label: service
- source_labels:
- __meta_kubernetes_pod_name
target_label: pod
- source_labels:
- __meta_kubernetes_pod_container_name
target_label: container
- action: drop
source_labels:
- __meta_kubernetes_pod_phase
regex: (Failed|Succeeded)
- source_labels:
- __meta_kubernetes_service_name
target_label: job
replacement: ${1}
- source_labels:
- __meta_kubernetes_service_label_jobLabel
target_label: job
regex: (.+)
replacement: ${1}
- target_label: endpoint
replacement: http-metrics
- source_labels:
- __address__
target_label: __tmp_hash
modulus: 1
action: hashmod
- source_labels:
- __tmp_hash
regex: $(SHARD)
action: keep
metric_relabel_configs: []
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-kube-etcd/0
honor_labels: false
kubernetes_sd_configs:
- role: endpoints
namespaces:
names:
- kube-system
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
relabel_configs:
- source_labels:
- job
target_label: __tmp_prometheus_job_name
- action: keep
source_labels:
- __meta_kubernetes_service_label_app
- __meta_kubernetes_service_labelpresent_app
regex: (kube-prometheus-stack-kube-etcd);true
- action: keep
source_labels:
- __meta_kubernetes_service_label_release
- __meta_kubernetes_service_labelpresent_release
regex: (aml-monitoring);true
- action: keep
source_labels:
- __meta_kubernetes_endpoint_port_name
regex: http-metrics
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Node;(.*)
replacement: ${1}
target_label: node
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Pod;(.*)
replacement: ${1}
target_label: pod
- source_labels:
- __meta_kubernetes_namespace
target_label: namespace
- source_labels:
- __meta_kubernetes_service_name
target_label: service
- source_labels:
- __meta_kubernetes_pod_name
target_label: pod
- source_labels:
- __meta_kubernetes_pod_container_name
target_label: container
- action: drop
source_labels:
- __meta_kubernetes_pod_phase
regex: (Failed|Succeeded)
- source_labels:
- __meta_kubernetes_service_name
target_label: job
replacement: ${1}
- source_labels:
- __meta_kubernetes_service_label_jobLabel
target_label: job
regex: (.+)
replacement: ${1}
- target_label: endpoint
replacement: http-metrics
- source_labels:
- __address__
target_label: __tmp_hash
modulus: 1
action: hashmod
- source_labels:
- __tmp_hash
regex: $(SHARD)
action: keep
metric_relabel_configs: []
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-kubelet/0
honor_labels: true
kubernetes_sd_configs:
- role: endpoints
namespaces:
names:
- kube-system
scheme: https
tls_config:
insecure_skip_verify: true
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
relabel_configs:
- source_labels:
- job
target_label: __tmp_prometheus_job_name
- action: keep
source_labels:
- __meta_kubernetes_service_label_app_kubernetes_io_name
- __meta_kubernetes_service_labelpresent_app_kubernetes_io_name
regex: (kubelet);true
- action: keep
source_labels:
- __meta_kubernetes_service_label_k8s_app
- __meta_kubernetes_service_labelpresent_k8s_app
regex: (kubelet);true
- action: keep
source_labels:
- __meta_kubernetes_endpoint_port_name
regex: https-metrics
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Node;(.*)
replacement: ${1}
target_label: node
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Pod;(.*)
replacement: ${1}
target_label: pod
- source_labels:
- __meta_kubernetes_namespace
target_label: namespace
- source_labels:
- __meta_kubernetes_service_name
target_label: service
- source_labels:
- __meta_kubernetes_pod_name
target_label: pod
- source_labels:
- __meta_kubernetes_pod_container_name
target_label: container
- action: drop
source_labels:
- __meta_kubernetes_pod_phase
regex: (Failed|Succeeded)
- source_labels:
- __meta_kubernetes_service_name
target_label: job
replacement: ${1}
- source_labels:
- __meta_kubernetes_service_label_k8s_app
target_label: job
regex: (.+)
replacement: ${1}
- target_label: endpoint
replacement: https-metrics
- source_labels:
- __metrics_path__
target_label: metrics_path
action: replace
- source_labels:
- __address__
target_label: __tmp_hash
modulus: 1
action: hashmod
- source_labels:
- __tmp_hash
regex: $(SHARD)
action: keep
metric_relabel_configs: []
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-kubelet/1
honor_labels: true
kubernetes_sd_configs:
- role: endpoints
namespaces:
names:
- kube-system
metrics_path: /metrics/cadvisor
scheme: https
tls_config:
insecure_skip_verify: true
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
relabel_configs:
- source_labels:
- job
target_label: __tmp_prometheus_job_name
- action: keep
source_labels:
- __meta_kubernetes_service_label_app_kubernetes_io_name
- __meta_kubernetes_service_labelpresent_app_kubernetes_io_name
regex: (kubelet);true
- action: keep
source_labels:
- __meta_kubernetes_service_label_k8s_app
- __meta_kubernetes_service_labelpresent_k8s_app
regex: (kubelet);true
- action: keep
source_labels:
- __meta_kubernetes_endpoint_port_name
regex: https-metrics
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Node;(.*)
replacement: ${1}
target_label: node
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Pod;(.*)
replacement: ${1}
target_label: pod
- source_labels:
- __meta_kubernetes_namespace
target_label: namespace
- source_labels:
- __meta_kubernetes_service_name
target_label: service
- source_labels:
- __meta_kubernetes_pod_name
target_label: pod
- source_labels:
- __meta_kubernetes_pod_container_name
target_label: container
- action: drop
source_labels:
- __meta_kubernetes_pod_phase
regex: (Failed|Succeeded)
- source_labels:
- __meta_kubernetes_service_name
target_label: job
replacement: ${1}
- source_labels:
- __meta_kubernetes_service_label_k8s_app
target_label: job
regex: (.+)
replacement: ${1}
- target_label: endpoint
replacement: https-metrics
- source_labels:
- __metrics_path__
target_label: metrics_path
action: replace
- source_labels:
- __address__
target_label: __tmp_hash
modulus: 1
action: hashmod
- source_labels:
- __tmp_hash
regex: $(SHARD)
action: keep
metric_relabel_configs:
- source_labels:
- __name__
regex: container_cpu_(cfs_throttled_seconds_total|load_average_10s|system_seconds_total|user_seconds_total)
action: drop
- source_labels:
- __name__
regex: container_fs_(io_current|io_time_seconds_total|io_time_weighted_seconds_total|reads_merged_total|sector_reads_total|sector_writes_total|writes_merged_total)
action: drop
- source_labels:
- __name__
regex: container_memory_(mapped_file|swap)
action: drop
- source_labels:
- __name__
regex: container_(file_descriptors|tasks_state|threads_max)
action: drop
- source_labels:
- __name__
regex: container_spec.*
action: drop
- source_labels:
- id
- pod
regex: .+;
action: drop
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-kubelet/2
honor_labels: true
kubernetes_sd_configs:
- role: endpoints
namespaces:
names:
- kube-system
metrics_path: /metrics/probes
scheme: https
tls_config:
insecure_skip_verify: true
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
relabel_configs:
- source_labels:
- job
target_label: __tmp_prometheus_job_name
- action: keep
source_labels:
- __meta_kubernetes_service_label_app_kubernetes_io_name
- __meta_kubernetes_service_labelpresent_app_kubernetes_io_name
regex: (kubelet);true
- action: keep
source_labels:
- __meta_kubernetes_service_label_k8s_app
- __meta_kubernetes_service_labelpresent_k8s_app
regex: (kubelet);true
- action: keep
source_labels:
- __meta_kubernetes_endpoint_port_name
regex: https-metrics
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Node;(.*)
replacement: ${1}
target_label: node
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Pod;(.*)
replacement: ${1}
target_label: pod
- source_labels:
- __meta_kubernetes_namespace
target_label: namespace
- source_labels:
- __meta_kubernetes_service_name
target_label: service
- source_labels:
- __meta_kubernetes_pod_name
target_label: pod
- source_labels:
- __meta_kubernetes_pod_container_name
target_label: container
- action: drop
source_labels:
- __meta_kubernetes_pod_phase
regex: (Failed|Succeeded)
- source_labels:
- __meta_kubernetes_service_name
target_label: job
replacement: ${1}
- source_labels:
- __meta_kubernetes_service_label_k8s_app
target_label: job
regex: (.+)
replacement: ${1}
- target_label: endpoint
replacement: https-metrics
- source_labels:
- __metrics_path__
target_label: metrics_path
action: replace
- source_labels:
- __address__
target_label: __tmp_hash
modulus: 1
action: hashmod
- source_labels:
- __tmp_hash
regex: $(SHARD)
action: keep
metric_relabel_configs: []
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-operator/0
honor_labels: true
kubernetes_sd_configs:
- role: endpoints
namespaces:
names:
- aml-monitoring
scheme: https
tls_config:
insecure_skip_verify: false
ca_file: /etc/prometheus/certs/secret_aml-monitoring_aml-monitoring-kube-promet-admission_ca
server_name: aml-monitoring-kube-promet-operator
relabel_configs:
- source_labels:
- job
target_label: __tmp_prometheus_job_name
- action: keep
source_labels:
- __meta_kubernetes_service_label_app
- __meta_kubernetes_service_labelpresent_app
regex: (kube-prometheus-stack-operator);true
- action: keep
source_labels:
- __meta_kubernetes_service_label_release
- __meta_kubernetes_service_labelpresent_release
regex: (aml-monitoring);true
- action: keep
source_labels:
- __meta_kubernetes_endpoint_port_name
regex: https
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Node;(.*)
replacement: ${1}
target_label: node
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Pod;(.*)
replacement: ${1}
target_label: pod
- source_labels:
- __meta_kubernetes_namespace
target_label: namespace
- source_labels:
- __meta_kubernetes_service_name
target_label: service
- source_labels:
- __meta_kubernetes_pod_name
target_label: pod
- source_labels:
- __meta_kubernetes_pod_container_name
target_label: container
- action: drop
source_labels:
- __meta_kubernetes_pod_phase
regex: (Failed|Succeeded)
- source_labels:
- __meta_kubernetes_service_name
target_label: job
replacement: ${1}
- target_label: endpoint
replacement: https
- source_labels:
- __address__
target_label: __tmp_hash
modulus: 1
action: hashmod
- source_labels:
- __tmp_hash
regex: $(SHARD)
action: keep
metric_relabel_configs: []
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-prometheus/0
honor_labels: false
kubernetes_sd_configs:
- role: endpoints
namespaces:
names:
- aml-monitoring
metrics_path: /metrics
relabel_configs:
- source_labels:
- job
target_label: __tmp_prometheus_job_name
- action: keep
source_labels:
- __meta_kubernetes_service_label_app
- __meta_kubernetes_service_labelpresent_app
regex: (kube-prometheus-stack-prometheus);true
- action: keep
source_labels:
- __meta_kubernetes_service_label_release
- __meta_kubernetes_service_labelpresent_release
regex: (aml-monitoring);true
- action: keep
source_labels:
- __meta_kubernetes_service_label_self_monitor
- __meta_kubernetes_service_labelpresent_self_monitor
regex: (true);true
- action: keep
source_labels:
- __meta_kubernetes_endpoint_port_name
regex: http-web
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Node;(.*)
replacement: ${1}
target_label: node
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Pod;(.*)
replacement: ${1}
target_label: pod
- source_labels:
- __meta_kubernetes_namespace
target_label: namespace
- source_labels:
- __meta_kubernetes_service_name
target_label: service
- source_labels:
- __meta_kubernetes_pod_name
target_label: pod
- source_labels:
- __meta_kubernetes_pod_container_name
target_label: container
- action: drop
source_labels:
- __meta_kubernetes_pod_phase
regex: (Failed|Succeeded)
- source_labels:
- __meta_kubernetes_service_name
target_label: job
replacement: ${1}
- target_label: endpoint
replacement: http-web
- source_labels:
- __address__
target_label: __tmp_hash
modulus: 1
action: hashmod
- source_labels:
- __tmp_hash
regex: $(SHARD)
action: keep
metric_relabel_configs: []
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-prometheus/1
honor_labels: false
kubernetes_sd_configs:
- role: endpoints
namespaces:
names:
- aml-monitoring
metrics_path: /metrics
relabel_configs:
- source_labels:
- job
target_label: __tmp_prometheus_job_name
- action: keep
source_labels:
- __meta_kubernetes_service_label_app
- __meta_kubernetes_service_labelpresent_app
regex: (kube-prometheus-stack-prometheus);true
- action: keep
source_labels:
- __meta_kubernetes_service_label_release
- __meta_kubernetes_service_labelpresent_release
regex: (aml-monitoring);true
- action: keep
source_labels:
- __meta_kubernetes_service_label_self_monitor
- __meta_kubernetes_service_labelpresent_self_monitor
regex: (true);true
- action: keep
source_labels:
- __meta_kubernetes_endpoint_port_name
regex: reloader-web
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Node;(.*)
replacement: ${1}
target_label: node
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Pod;(.*)
replacement: ${1}
target_label: pod
- source_labels:
- __meta_kubernetes_namespace
target_label: namespace
- source_labels:
- __meta_kubernetes_service_name
target_label: service
- source_labels:
- __meta_kubernetes_pod_name
target_label: pod
- source_labels:
- __meta_kubernetes_pod_container_name
target_label: container
- action: drop
source_labels:
- __meta_kubernetes_pod_phase
regex: (Failed|Succeeded)
- source_labels:
- __meta_kubernetes_service_name
target_label: job
replacement: ${1}
- target_label: endpoint
replacement: reloader-web
- source_labels:
- __address__
target_label: __tmp_hash
modulus: 1
action: hashmod
- source_labels:
- __tmp_hash
regex: $(SHARD)
action: keep
metric_relabel_configs: []
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-state-metrics/0
honor_labels: true
kubernetes_sd_configs:
- role: endpoints
namespaces:
names:
- aml-monitoring
relabel_configs:
- source_labels:
- job
target_label: __tmp_prometheus_job_name
- action: keep
source_labels:
- __meta_kubernetes_service_label_app_kubernetes_io_instance
- __meta_kubernetes_service_labelpresent_app_kubernetes_io_instance
regex: (aml-monitoring);true
- action: keep
source_labels:
- __meta_kubernetes_service_label_app_kubernetes_io_name
- __meta_kubernetes_service_labelpresent_app_kubernetes_io_name
regex: (kube-state-metrics);true
- action: keep
source_labels:
- __meta_kubernetes_endpoint_port_name
regex: http
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Node;(.*)
replacement: ${1}
target_label: node
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Pod;(.*)
replacement: ${1}
target_label: pod
- source_labels:
- __meta_kubernetes_namespace
target_label: namespace
- source_labels:
- __meta_kubernetes_service_name
target_label: service
- source_labels:
- __meta_kubernetes_pod_name
target_label: pod
- source_labels:
- __meta_kubernetes_pod_container_name
target_label: container
- action: drop
source_labels:
- __meta_kubernetes_pod_phase
regex: (Failed|Succeeded)
- source_labels:
- __meta_kubernetes_service_name
target_label: job
replacement: ${1}
- source_labels:
- __meta_kubernetes_service_label_app_kubernetes_io_name
target_label: job
regex: (.+)
replacement: ${1}
- target_label: endpoint
replacement: http
- source_labels:
- __address__
target_label: __tmp_hash
modulus: 1
action: hashmod
- source_labels:
- __tmp_hash
regex: $(SHARD)
action: keep
metric_relabel_configs: []
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-loki/0
honor_labels: false
kubernetes_sd_configs:
- role: endpoints
namespaces:
names:
- aml-monitoring
scrape_interval: 15s
metrics_path: /metrics
scheme: http
relabel_configs:
- source_labels:
- job
target_label: __tmp_prometheus_job_name
- action: keep
source_labels:
- __meta_kubernetes_service_label_app_kubernetes_io_instance
- __meta_kubernetes_service_labelpresent_app_kubernetes_io_instance
regex: (aml-monitoring);true
- action: keep
source_labels:
- __meta_kubernetes_service_label_app_kubernetes_io_name
- __meta_kubernetes_service_labelpresent_app_kubernetes_io_name
regex: (loki);true
- action: drop
source_labels:
- __meta_kubernetes_service_label_prometheus_io_service_monitor
- __meta_kubernetes_service_labelpresent_prometheus_io_service_monitor
regex: (false);true
- action: keep
source_labels:
- __meta_kubernetes_endpoint_port_name
regex: http-metrics
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Node;(.*)
replacement: ${1}
target_label: node
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Pod;(.*)
replacement: ${1}
target_label: pod
- source_labels:
- __meta_kubernetes_namespace
target_label: namespace
- source_labels:
- __meta_kubernetes_service_name
target_label: service
- source_labels:
- __meta_kubernetes_pod_name
target_label: pod
- source_labels:
- __meta_kubernetes_pod_container_name
target_label: container
- action: drop
source_labels:
- __meta_kubernetes_pod_phase
regex: (Failed|Succeeded)
- source_labels:
- __meta_kubernetes_service_name
target_label: job
replacement: ${1}
- target_label: endpoint
replacement: http-metrics
- source_labels:
- job
target_label: job
replacement: aml-monitoring/$1
action: replace
- target_label: cluster
replacement: aml-monitoring-loki
action: replace
- source_labels:
- __address__
target_label: __tmp_hash
modulus: 1
action: hashmod
- source_labels:
- __tmp_hash
regex: $(SHARD)
action: keep
metric_relabel_configs: []
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-prometheus-node-exporter/0
honor_labels: false
kubernetes_sd_configs:
- role: endpoints
namespaces:
names:
- aml-monitoring
attach_metadata:
node: false
scheme: http
relabel_configs:
- source_labels:
- job
target_label: __tmp_prometheus_job_name
- action: keep
source_labels:
- __meta_kubernetes_service_label_app_kubernetes_io_instance
- __meta_kubernetes_service_labelpresent_app_kubernetes_io_instance
regex: (aml-monitoring);true
- action: keep
source_labels:
- __meta_kubernetes_service_label_app_kubernetes_io_name
- __meta_kubernetes_service_labelpresent_app_kubernetes_io_name
regex: (prometheus-node-exporter);true
- action: keep
source_labels:
- __meta_kubernetes_endpoint_port_name
regex: http-metrics
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Node;(.*)
replacement: ${1}
target_label: node
- source_labels:
- __meta_kubernetes_endpoint_address_target_kind
- __meta_kubernetes_endpoint_address_target_name
separator: ;
regex: Pod;(.*)
replacement: ${1}
target_label: pod
- source_labels:
- __meta_kubernetes_namespace
target_label: namespace
- source_labels:
- __meta_kubernetes_service_name
target_label: service
- source_labels:
- __meta_kubernetes_pod_name
target_label: pod
- source_labels:
- __meta_kubernetes_pod_container_name
target_label: container
- action: drop
source_labels:
- __meta_kubernetes_pod_phase
regex: (Failed|Succeeded)
- source_labels:
- __meta_kubernetes_service_name
target_label: job
replacement: ${1}
- source_labels:
- __meta_kubernetes_service_label_jobLabel
target_label: job
regex: (.+)
replacement: ${1}
- target_label: endpoint
replacement: http-metrics
- source_labels:
- __address__
target_label: __tmp_hash
modulus: 1
action: hashmod
- source_labels:
- __tmp_hash
regex: $(SHARD)
action: keep
metric_relabel_configs: []
storage:
tsdb:
out_of_order_time_window: 0s
alerting:
alert_relabel_configs:
- action: labeldrop
regex: prometheus_replica
alertmanagers:
- path_prefix: /
scheme: http
kubernetes_sd_configs:
- role: endpoints
namespaces:
names:
- aml-monitoring
api_version: v2
relabel_configs:
- action: keep
source_labels:
- __meta_kubernetes_service_name
regex: aml-monitoring-kube-promet-alertmanager
- action: keep
source_labels:
- __meta_kubernetes_endpoint_port_name
regex: http-web
Well, so this is interesting.
This is what I'm getting on my Amazon EKS cluster:
kubectl get podMonitor -n aml -o yaml
apiVersion: v1
items: []
kind: List
metadata:
  resourceVersion: ""
kubectl get serviceMonitor -n aml -o yaml
apiVersion: v1
items: []
kind: List
metadata:
  resourceVersion: ""
Whereas this is what I get on my Azure AKS cluster, running exactly the same applications in the aml namespace with the exact same Prometheus version and configuration:
kubectl get podMonitor -n aml -o yaml
apiVersion: v1
items:
- apiVersion: monitoring.coreos.com/v1
  kind: PodMonitor
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"monitoring.coreos.com/v1","kind":"PodMonitor","metadata":{"annotations":{},"labels":{"argocd.argoproj.io/instance":"devx-prod-aml"},"name":"kafka-cluster-operator-metrics","namespace":"aml"},"spec":{"podMetricsEndpoints":[{"path":"/metrics","port":"http"}],"selector":{"matchLabels":{"strimzi.io/kind":"cluster-operator"}}}}
    creationTimestamp: "2023-07-04T16:23:23Z"
    generation: 1
    labels:
      argocd.argoproj.io/instance: devx-prod-aml
    name: kafka-cluster-operator-metrics
    namespace: aml
    resourceVersion: "34474"
    uid: 69e5adeb-d69d-4b58-b45a-06bf514ab7bf
  spec:
    podMetricsEndpoints:
    - path: /metrics
      port: http
    selector:
      matchLabels:
        strimzi.io/kind: cluster-operator
- apiVersion: monitoring.coreos.com/v1
  kind: PodMonitor
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"monitoring.coreos.com/v1","kind":"PodMonitor","metadata":{"annotations":{},"labels":{"argocd.argoproj.io/instance":"devx-prod-aml"},"name":"kafka-entity-operator-metrics","namespace":"aml"},"spec":{"podMetricsEndpoints":[{"path":"/metrics","port":"healthcheck"}],"selector":{"matchLabels":{"app.kubernetes.io/name":"entity-operator"}}}}
    creationTimestamp: "2023-07-04T16:23:23Z"
    generation: 1
    labels:
      argocd.argoproj.io/instance: devx-prod-aml
    name: kafka-entity-operator-metrics
    namespace: aml
    resourceVersion: "34473"
    uid: b74d8a03-22f3-47e0-94f3-fd50fb9bfe64
  spec:
    podMetricsEndpoints:
    - path: /metrics
      port: healthcheck
    selector:
      matchLabels:
        app.kubernetes.io/name: entity-operator
kubectl get serviceMonitor -n aml -o yaml
apiVersion: v1
items:
- apiVersion: monitoring.coreos.com/v1
  kind: ServiceMonitor
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"monitoring.coreos.com/v1","kind":"ServiceMonitor","metadata":{"annotations":{},"labels":{"app":"nifi","argocd.argoproj.io/instance":"devx-prod-aml","chart":"nifi-1.1.42","heritage":"Helm","release":"devx-prod-aml"},"name":"nifi","namespace":"aml"},"spec":{"endpoints":[{"honorLabels":true,"port":"metrics"}],"namespaceSelector":{"matchNames":["aml"]},"selector":{"matchLabels":{"app":"nifi","release":"devx-prod-aml"}}}}
    creationTimestamp: "2023-07-04T16:23:23Z"
    generation: 1
    labels:
      app: nifi
      argocd.argoproj.io/instance: devx-prod-aml
      chart: nifi-1.1.42
      heritage: Helm
      release: devx-prod-aml
    name: nifi
    namespace: aml
    resourceVersion: "34480"
    uid: 59432a37-5209-483f-8829-b7efa076f273
  spec:
    endpoints:
    - honorLabels: true
      port: metrics
    namespaceSelector:
      matchNames:
      - aml
    selector:
      matchLabels:
        app: nifi
        release: devx-prod-aml
Why would this be different on EKS (not working) vs AKS (working)?
I cannot find any error messages or failures anywhere!
I guess we have a similar problem, yes (https://github.com/prometheus-community/helm-charts/issues/3487).
I tested it with a local kind cluster as well as with k3s on Hetzner Cloud. Prometheus can discover a ServiceMonitor when it is in the same namespace as the kube-prometheus-stack, in the default namespace, or in the kube-system namespace. But when I use my own namespace, it is not found.
In my case my application namespace is pixolution.
The relevant part of my values.yaml looks like this:
prometheus:
  enabled: true
  prometheusSpec:
    nodeSelector:
      node-type: app
    serviceMonitorSelectorNilUsesHelmValues: false
    serviceMonitorSelector:
      matchLabels:
        release: kube-prometheus-stack
    serviceMonitorNamespaceSelector:
      matchExpressions:
        - key: name
          operator: In
          values:
            - monitoring
            - pixolution
            - kube-system
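One possible cause worth checking here (my assumption, not something confirmed in this thread): a serviceMonitorNamespaceSelector with key: name only matches namespaces that actually carry a name label, and Kubernetes does not add one automatically; on recent versions the only auto-populated namespace label is kubernetes.io/metadata.name. For the selector above to match, the selected namespaces would need the label set explicitly, for example:

```yaml
# Hypothetical manifest: the `name: pixolution` label must be added by hand,
# because Kubernetes only auto-populates `kubernetes.io/metadata.name`.
apiVersion: v1
kind: Namespace
metadata:
  name: pixolution
  labels:
    name: pixolution
```

Alternatively, the matchExpressions key could be changed to kubernetes.io/metadata.name, which every namespace carries by default on current Kubernetes versions.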
My ServiceMonitor has the proper label (node-type: app) and is deployed to the pixolution namespace, but it is not discovered, apparently due to permission issues. I tried to solve this by creating a Role and RoleBinding in the pixolution namespace for the existing ServiceAccount kube-prometheus-stack-operator, but that did not work. Then I gave up and created a ClusterRole and ClusterRoleBinding instead:
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  labels:
    app: kube-prometheus-stack-prometheus
    app.kubernetes.io/instance: kube-prometheus-stack
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/part-of: kube-prometheus-stack
  name: prometheus-kafka-clusterrole
rules:
  - verbs:
      - get
      - list
      - watch
    apiGroups:
      - ''
    resources:
      - nodes
      - nodes/metrics
      - services
      - endpoints
      - pods
  - verbs:
      - get
      - list
      - watch
    apiGroups:
      - networking.k8s.io
    resources:
      - ingresses
  - verbs:
      - get
      - list
      - watch
    nonResourceURLs:
      - /metrics
      - /metrics/cadvisor
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    app: kube-prometheus-stack-prometheus
    app.kubernetes.io/instance: kube-prometheus-stack
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/part-of: kube-prometheus-stack
  name: prometheus-kafka
subjects:
  - kind: ServiceAccount
    name: kube-prometheus-stack-operator
    namespace: monitoring
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus-kafka-clusterrole
See https://github.com/pixolution/kube-prometheus-stack-plus-kafka for the full example code.
Managed to find the root cause.
If the applications are installed before the PodMonitor/ServiceMonitor CRDs are installed, their charts do not create PodMonitor/ServiceMonitor resources, and therefore there is nothing for Prometheus to discover.
Simply deleting the pods doesn't fix it; you have to install Prometheus first and then re-deploy the applications with a helm upgrade or similar.
After this, the PodMonitor/ServiceMonitor resources will be created and Prometheus can discover them.
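The recovery sequence above can be sketched roughly as follows (illustrative only; the application chart path, release names, and values files are placeholders, not taken from this issue):

```shell
# 1. Install kube-prometheus-stack first, so the PodMonitor/ServiceMonitor
#    CRDs exist before any application chart is rendered.
helm upgrade --install -n aml-monitoring aml-monitoring ./ -f values.yaml

# 2. Confirm the CRDs are now registered.
kubectl get crd servicemonitors.monitoring.coreos.com podmonitors.monitoring.coreos.com

# 3. Re-deploy the applications so their charts (re-)create the monitor
#    resources that were silently skipped on the first install.
helm upgrade --install -n aml my-app ./my-app-chart -f app-values.yaml

# 4. Verify the monitors exist for Prometheus to discover.
kubectl get servicemonitors,podmonitors -n aml
```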
Describe the bug
With kube-prometheus-stack 47.0.0, discovery of resources in namespaces other than default, kube-system, and the install namespace no longer seems to work on EKS v1.24.10, even with these options set:
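For reference, the namespace options in question take this shape in values.yaml (the namespaces block is as reported at the top of this issue; nesting it under prometheusOperator is my assumption about where it sits in the chart values):

```yaml
prometheusOperator:
  namespaces:
    releaseNamespace: true
    additional:
      - aml
      - ingress-nginx
      - kube-system
```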
I also tried this variation which is an inverse match to select all namespaces:
...but Prometheus seems to ignore those options and only discover resources in the default, kube-system and [install namespace].
Helm chart:
Prometheus config:
Prometheus logs:
What's your helm version?
version.BuildInfo{Version:"v3.12.1", GitCommit:"f32a527a060157990e2aa86bf45010dfb3cc8b8d", GitTreeState:"clean", GoVersion:"go1.20.4"}
What's your kubectl version?
Client Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.2", GitCommit:"7f6f68fdabc4df88cfea2dcf9a19b2b830f1e647", GitTreeState:"clean", BuildDate:"2023-05-17T14:20:07Z", GoVersion:"go1.20.4", Compiler:"gc", Platform:"linux/amd64"} Kustomize Version: v5.0.1
Which chart?
kube-prometheus-stack
What's the chart version?
47.0.0
What happened?
Prometheus is ignoring the configuration to scan all namespaces in the EKS cluster.
What you expected to happen?
I'm expecting Prometheus to recognise these config options:
How to reproduce it?
No response
Enter the changed values of values.yaml?
No response
Enter the command that you execute and failing/misfunctioning.
helm upgrade --install -n aml-monitoring aml-monitoring ./ -f values.yaml
Anything else we need to know?
No response