prometheus-community / helm-charts

Prometheus community Helm charts
Apache License 2.0

[kube-prometheus-stack] Discovery of additional namespaces not working on EKS with chart 47.0.0 #3549

Open johnswarbrick-napier opened 1 year ago

johnswarbrick-napier commented 1 year ago

Describe the bug

With kube-prometheus-stack 47.0.0, discovery of resources in namespaces other than default, kube-system and [install namespace] no longer seems to work on EKS v1.24.10, even with these options set:

prometheus:
  prometheusSpec:
    podMonitorSelectorNilUsesHelmValues: false
    serviceMonitorSelectorNilUsesHelmValues: false
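
For context, these two flags affect which PodMonitor and ServiceMonitor objects are selected, not which namespaces are searched. Below is a minimal Python sketch of that behaviour, assuming the chart works the way its values.yaml comments describe. The function `render_selector` and its parameter names are hypothetical and exist only for illustration; they are not part of the chart.

```python
from typing import Optional


def render_selector(explicit: Optional[dict], nil_uses_helm_values: bool,
                    release_name: str) -> dict:
    """Approximate the monitor selector placed on the Prometheus resource.

    Assumption: an explicitly configured selector wins; otherwise the flag
    decides between a release-label match and an empty selector.
    """
    if explicit:
        return explicit
    if nil_uses_helm_values:
        # Only monitors labelled with this Helm release are picked up.
        return {"matchLabels": {"release": release_name}}
    # An empty selector matches every monitor object.
    return {}


print(render_selector(None, True, "aml-monitoring"))
print(render_selector(None, False, "aml-monitoring"))
```

If this assumption holds, setting both flags to false should already yield empty monitor selectors, so the namespace selectors are the remaining place to look.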

I also tried this variation, which uses an inverse match to select all namespaces:

prometheus:
  prometheusSpec:
    podMonitorSelectorNilUsesHelmValues: false
    podMonitorNamespaceSelector:
      matchExpressions:
      - key: "non-existent-label"
        operator: "DoesNotExist"
    serviceMonitorSelectorNilUsesHelmValues: false
    serviceMonitorNamespaceSelector:
      matchExpressions:
      - key: "non-existent-label"
        operator: "DoesNotExist"

...but Prometheus seems to ignore those options and only discovers resources in the default, kube-system and [install namespace].
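
To show why the inverse match is expected to select every namespace, here is a small Python sketch of Kubernetes label selector semantics. It is an illustration only, not the operator's code: the `matches` function, the namespace names other than those in this report, and the label values are all made up for the example.

```python
from typing import Dict, Optional


def matches(selector: Optional[dict], labels: Dict[str, str]) -> bool:
    """Evaluate a Kubernetes-style label selector against a set of labels."""
    if selector is None:
        return False  # a nil selector matches nothing
    for key, value in selector.get("matchLabels", {}).items():
        if labels.get(key) != value:
            return False
    for expr in selector.get("matchExpressions", []):
        key, op = expr["key"], expr["operator"]
        values = expr.get("values", [])
        if op == "In":
            ok = key in labels and labels[key] in values
        elif op == "NotIn":
            ok = key not in labels or labels[key] not in values
        elif op == "Exists":
            ok = key in labels
        elif op == "DoesNotExist":
            ok = key not in labels
        else:
            raise ValueError(f"unknown operator: {op}")
        if not ok:
            return False
    return True  # an empty selector ({}) falls through and matches everything


# Hypothetical namespaces and labels, for illustration only.
namespaces = {
    "default": {"kubernetes.io/metadata.name": "default"},
    "kube-system": {"kubernetes.io/metadata.name": "kube-system"},
    "my-app": {"kubernetes.io/metadata.name": "my-app", "team": "payments"},
}
inverse = {"matchExpressions": [
    {"key": "non-existent-label", "operator": "DoesNotExist"},
]}

selected = [name for name, labels in namespaces.items() if matches(inverse, labels)]
print(selected)
```

Under these semantics the inverse match and an empty selector select the same namespaces, so the selector itself should not be what restricts discovery.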

Helm chart values:

global:
  rbac:
    create: true

kube-prometheus-stack:
  cleanPrometheusOperatorObjectNames: true
  grafana:
    enabled: false
  prometheusOperator:
    admissionWebhooks:
      enabled: false
      certManager:
        enabled: false
      patch:
        enabled: false
    tls:
      enabled: false
    tlsProxy:
      enabled: false
  prometheus:
    prometheusSpec:
      podMonitorSelectorNilUsesHelmValues: false
      podMonitorNamespaceSelector:
        matchExpressions:
        - key: "non-existent-label"
          operator: "DoesNotExist"
      serviceMonitorSelectorNilUsesHelmValues: false
      serviceMonitorNamespaceSelector:
        matchExpressions:
        - key: "non-existent-label"
          operator: "DoesNotExist"
      storageSpec:
        volumeClaimTemplate:
          spec:
            #accessModes:
            #  - ReadWriteOnce
            resources:
              requests:
                storage: 50Gi
            storageClassName: gp2
      retention: 14d
      additionalScrapeConfigs: []
  kubeControllerManager:
    enabled: false
    service:
      enabled: true
      selector:
        k8s-app: kube-controller-manager
  kubeProxy:
    enabled: false
    service:
      enabled: false
      selector:
        # make sure proxy pod labels are added here
        k8s-app: kube-proxy
  kubeScheduler:
    enabled: false
    service:
      enabled: false
      selector:
        # make sure scheduler pod labels are added here
        k8s-app: kube-scheduler

Prometheus config:

global:
  scrape_interval: 30s
  scrape_timeout: 10s
  evaluation_interval: 30s
  external_labels:
    prometheus: aml-monitoring/aml-monitoring-kube-promet
    prometheus_replica: prometheus-aml-monitoring-kube-promet-0
alerting:
  alert_relabel_configs:
  - separator: ;
    regex: prometheus_replica
    replacement: $1
    action: labeldrop
  alertmanagers:
  - follow_redirects: true
    enable_http2: true
    scheme: http
    path_prefix: /
    timeout: 10s
    api_version: v2
    relabel_configs:
    - source_labels: [__meta_kubernetes_service_name]
      separator: ;
      regex: aml-monitoring-kube-promet-alertmanager
      replacement: $1
      action: keep
    - source_labels: [__meta_kubernetes_endpoint_port_name]
      separator: ;
      regex: http-web
      replacement: $1
      action: keep
    kubernetes_sd_configs:
    - role: endpoints
      kubeconfig_file: ""
      follow_redirects: true
      enable_http2: true
      namespaces:
        own_namespace: false
        names:
        - aml-monitoring
rule_files:
- /etc/prometheus/rules/prometheus-aml-monitoring-kube-promet-rulefiles-0/*.yaml
scrape_configs:
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-alertmanager/0
  honor_timestamps: true
  scrape_interval: 30s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: http
  follow_redirects: true
  enable_http2: true
  relabel_configs:
  - source_labels: [job]
    separator: ;
    regex: (.*)
    target_label: __tmp_prometheus_job_name
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_label_app, __meta_kubernetes_service_labelpresent_app]
    separator: ;
    regex: (kube-prometheus-stack-alertmanager);true
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_service_label_release, __meta_kubernetes_service_labelpresent_release]
    separator: ;
    regex: (aml-monitoring);true
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_service_label_self_monitor, __meta_kubernetes_service_labelpresent_self_monitor]
    separator: ;
    regex: (true);true
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: http-web
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Node;(.*)
    target_label: node
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Pod;(.*)
    target_label: pod
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: namespace
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: service
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_name]
    separator: ;
    regex: (.*)
    target_label: pod
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_container_name]
    separator: ;
    regex: (.*)
    target_label: container
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_phase]
    separator: ;
    regex: (Failed|Succeeded)
    replacement: $1
    action: drop
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: job
    replacement: ${1}
    action: replace
  - separator: ;
    regex: (.*)
    target_label: endpoint
    replacement: http-web
    action: replace
  - source_labels: [__address__]
    separator: ;
    regex: (.*)
    modulus: 1
    target_label: __tmp_hash
    replacement: $1
    action: hashmod
  - source_labels: [__tmp_hash]
    separator: ;
    regex: "0"
    replacement: $1
    action: keep
  kubernetes_sd_configs:
  - role: endpoints
    kubeconfig_file: ""
    follow_redirects: true
    enable_http2: true
    namespaces:
      own_namespace: false
      names:
      - aml-monitoring
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-apiserver/0
  honor_timestamps: true
  scrape_interval: 30s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: https
  authorization:
    type: Bearer
    credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    server_name: kubernetes
    insecure_skip_verify: false
  follow_redirects: true
  enable_http2: true
  relabel_configs:
  - source_labels: [job]
    separator: ;
    regex: (.*)
    target_label: __tmp_prometheus_job_name
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_label_component, __meta_kubernetes_service_labelpresent_component]
    separator: ;
    regex: (apiserver);true
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_service_label_provider, __meta_kubernetes_service_labelpresent_provider]
    separator: ;
    regex: (kubernetes);true
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: https
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Node;(.*)
    target_label: node
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Pod;(.*)
    target_label: pod
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: namespace
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: service
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_name]
    separator: ;
    regex: (.*)
    target_label: pod
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_container_name]
    separator: ;
    regex: (.*)
    target_label: container
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_phase]
    separator: ;
    regex: (Failed|Succeeded)
    replacement: $1
    action: drop
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: job
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_service_label_component]
    separator: ;
    regex: (.+)
    target_label: job
    replacement: ${1}
    action: replace
  - separator: ;
    regex: (.*)
    target_label: endpoint
    replacement: https
    action: replace
  - source_labels: [__address__]
    separator: ;
    regex: (.*)
    modulus: 1
    target_label: __tmp_hash
    replacement: $1
    action: hashmod
  - source_labels: [__tmp_hash]
    separator: ;
    regex: "0"
    replacement: $1
    action: keep
  metric_relabel_configs:
  - source_labels: [__name__, le]
    separator: ;
    regex: apiserver_request_duration_seconds_bucket;(0.15|0.2|0.3|0.35|0.4|0.45|0.6|0.7|0.8|0.9|1.25|1.5|1.75|2|3|3.5|4|4.5|6|7|8|9|15|25|40|50)
    replacement: $1
    action: drop
  kubernetes_sd_configs:
  - role: endpoints
    kubeconfig_file: ""
    follow_redirects: true
    enable_http2: true
    namespaces:
      own_namespace: false
      names:
      - default
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-coredns/0
  honor_timestamps: true
  scrape_interval: 30s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: http
  authorization:
    type: Bearer
    credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  follow_redirects: true
  enable_http2: true
  relabel_configs:
  - source_labels: [job]
    separator: ;
    regex: (.*)
    target_label: __tmp_prometheus_job_name
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_label_app, __meta_kubernetes_service_labelpresent_app]
    separator: ;
    regex: (kube-prometheus-stack-coredns);true
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_service_label_release, __meta_kubernetes_service_labelpresent_release]
    separator: ;
    regex: (aml-monitoring);true
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: http-metrics
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Node;(.*)
    target_label: node
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Pod;(.*)
    target_label: pod
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: namespace
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: service
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_name]
    separator: ;
    regex: (.*)
    target_label: pod
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_container_name]
    separator: ;
    regex: (.*)
    target_label: container
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_phase]
    separator: ;
    regex: (Failed|Succeeded)
    replacement: $1
    action: drop
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: job
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_service_label_jobLabel]
    separator: ;
    regex: (.+)
    target_label: job
    replacement: ${1}
    action: replace
  - separator: ;
    regex: (.*)
    target_label: endpoint
    replacement: http-metrics
    action: replace
  - source_labels: [__address__]
    separator: ;
    regex: (.*)
    modulus: 1
    target_label: __tmp_hash
    replacement: $1
    action: hashmod
  - source_labels: [__tmp_hash]
    separator: ;
    regex: "0"
    replacement: $1
    action: keep
  kubernetes_sd_configs:
  - role: endpoints
    kubeconfig_file: ""
    follow_redirects: true
    enable_http2: true
    namespaces:
      own_namespace: false
      names:
      - kube-system
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-kube-etcd/0
  honor_timestamps: true
  scrape_interval: 30s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: http
  authorization:
    type: Bearer
    credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  follow_redirects: true
  enable_http2: true
  relabel_configs:
  - source_labels: [job]
    separator: ;
    regex: (.*)
    target_label: __tmp_prometheus_job_name
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_label_app, __meta_kubernetes_service_labelpresent_app]
    separator: ;
    regex: (kube-prometheus-stack-kube-etcd);true
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_service_label_release, __meta_kubernetes_service_labelpresent_release]
    separator: ;
    regex: (aml-monitoring);true
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: http-metrics
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Node;(.*)
    target_label: node
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Pod;(.*)
    target_label: pod
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: namespace
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: service
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_name]
    separator: ;
    regex: (.*)
    target_label: pod
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_container_name]
    separator: ;
    regex: (.*)
    target_label: container
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_phase]
    separator: ;
    regex: (Failed|Succeeded)
    replacement: $1
    action: drop
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: job
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_service_label_jobLabel]
    separator: ;
    regex: (.+)
    target_label: job
    replacement: ${1}
    action: replace
  - separator: ;
    regex: (.*)
    target_label: endpoint
    replacement: http-metrics
    action: replace
  - source_labels: [__address__]
    separator: ;
    regex: (.*)
    modulus: 1
    target_label: __tmp_hash
    replacement: $1
    action: hashmod
  - source_labels: [__tmp_hash]
    separator: ;
    regex: "0"
    replacement: $1
    action: keep
  kubernetes_sd_configs:
  - role: endpoints
    kubeconfig_file: ""
    follow_redirects: true
    enable_http2: true
    namespaces:
      own_namespace: false
      names:
      - kube-system
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-kubelet/0
  honor_labels: true
  honor_timestamps: true
  scrape_interval: 30s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: https
  authorization:
    type: Bearer
    credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    insecure_skip_verify: true
  follow_redirects: true
  enable_http2: true
  relabel_configs:
  - source_labels: [job]
    separator: ;
    regex: (.*)
    target_label: __tmp_prometheus_job_name
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_label_app_kubernetes_io_name, __meta_kubernetes_service_labelpresent_app_kubernetes_io_name]
    separator: ;
    regex: (kubelet);true
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_service_label_k8s_app, __meta_kubernetes_service_labelpresent_k8s_app]
    separator: ;
    regex: (kubelet);true
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: https-metrics
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Node;(.*)
    target_label: node
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Pod;(.*)
    target_label: pod
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: namespace
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: service
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_name]
    separator: ;
    regex: (.*)
    target_label: pod
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_container_name]
    separator: ;
    regex: (.*)
    target_label: container
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_phase]
    separator: ;
    regex: (Failed|Succeeded)
    replacement: $1
    action: drop
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: job
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_service_label_k8s_app]
    separator: ;
    regex: (.+)
    target_label: job
    replacement: ${1}
    action: replace
  - separator: ;
    regex: (.*)
    target_label: endpoint
    replacement: https-metrics
    action: replace
  - source_labels: [__metrics_path__]
    separator: ;
    regex: (.*)
    target_label: metrics_path
    replacement: $1
    action: replace
  - source_labels: [__address__]
    separator: ;
    regex: (.*)
    modulus: 1
    target_label: __tmp_hash
    replacement: $1
    action: hashmod
  - source_labels: [__tmp_hash]
    separator: ;
    regex: "0"
    replacement: $1
    action: keep
  kubernetes_sd_configs:
  - role: endpoints
    kubeconfig_file: ""
    follow_redirects: true
    enable_http2: true
    namespaces:
      own_namespace: false
      names:
      - kube-system
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-kubelet/1
  honor_labels: true
  honor_timestamps: true
  scrape_interval: 30s
  scrape_timeout: 10s
  metrics_path: /metrics/cadvisor
  scheme: https
  authorization:
    type: Bearer
    credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    insecure_skip_verify: true
  follow_redirects: true
  enable_http2: true
  relabel_configs:
  - source_labels: [job]
    separator: ;
    regex: (.*)
    target_label: __tmp_prometheus_job_name
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_label_app_kubernetes_io_name, __meta_kubernetes_service_labelpresent_app_kubernetes_io_name]
    separator: ;
    regex: (kubelet);true
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_service_label_k8s_app, __meta_kubernetes_service_labelpresent_k8s_app]
    separator: ;
    regex: (kubelet);true
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: https-metrics
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Node;(.*)
    target_label: node
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Pod;(.*)
    target_label: pod
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: namespace
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: service
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_name]
    separator: ;
    regex: (.*)
    target_label: pod
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_container_name]
    separator: ;
    regex: (.*)
    target_label: container
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_phase]
    separator: ;
    regex: (Failed|Succeeded)
    replacement: $1
    action: drop
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: job
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_service_label_k8s_app]
    separator: ;
    regex: (.+)
    target_label: job
    replacement: ${1}
    action: replace
  - separator: ;
    regex: (.*)
    target_label: endpoint
    replacement: https-metrics
    action: replace
  - source_labels: [__metrics_path__]
    separator: ;
    regex: (.*)
    target_label: metrics_path
    replacement: $1
    action: replace
  - source_labels: [__address__]
    separator: ;
    regex: (.*)
    modulus: 1
    target_label: __tmp_hash
    replacement: $1
    action: hashmod
  - source_labels: [__tmp_hash]
    separator: ;
    regex: "0"
    replacement: $1
    action: keep
  metric_relabel_configs:
  - source_labels: [__name__]
    separator: ;
    regex: container_cpu_(cfs_throttled_seconds_total|load_average_10s|system_seconds_total|user_seconds_total)
    replacement: $1
    action: drop
  - source_labels: [__name__]
    separator: ;
    regex: container_fs_(io_current|io_time_seconds_total|io_time_weighted_seconds_total|reads_merged_total|sector_reads_total|sector_writes_total|writes_merged_total)
    replacement: $1
    action: drop
  - source_labels: [__name__]
    separator: ;
    regex: container_memory_(mapped_file|swap)
    replacement: $1
    action: drop
  - source_labels: [__name__]
    separator: ;
    regex: container_(file_descriptors|tasks_state|threads_max)
    replacement: $1
    action: drop
  - source_labels: [__name__]
    separator: ;
    regex: container_spec.*
    replacement: $1
    action: drop
  - source_labels: [id, pod]
    separator: ;
    regex: .+;
    replacement: $1
    action: drop
  kubernetes_sd_configs:
  - role: endpoints
    kubeconfig_file: ""
    follow_redirects: true
    enable_http2: true
    namespaces:
      own_namespace: false
      names:
      - kube-system
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-kubelet/2
  honor_labels: true
  honor_timestamps: true
  scrape_interval: 30s
  scrape_timeout: 10s
  metrics_path: /metrics/probes
  scheme: https
  authorization:
    type: Bearer
    credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    insecure_skip_verify: true
  follow_redirects: true
  enable_http2: true
  relabel_configs:
  - source_labels: [job]
    separator: ;
    regex: (.*)
    target_label: __tmp_prometheus_job_name
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_label_app_kubernetes_io_name, __meta_kubernetes_service_labelpresent_app_kubernetes_io_name]
    separator: ;
    regex: (kubelet);true
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_service_label_k8s_app, __meta_kubernetes_service_labelpresent_k8s_app]
    separator: ;
    regex: (kubelet);true
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: https-metrics
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Node;(.*)
    target_label: node
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Pod;(.*)
    target_label: pod
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: namespace
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: service
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_name]
    separator: ;
    regex: (.*)
    target_label: pod
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_container_name]
    separator: ;
    regex: (.*)
    target_label: container
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_phase]
    separator: ;
    regex: (Failed|Succeeded)
    replacement: $1
    action: drop
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: job
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_service_label_k8s_app]
    separator: ;
    regex: (.+)
    target_label: job
    replacement: ${1}
    action: replace
  - separator: ;
    regex: (.*)
    target_label: endpoint
    replacement: https-metrics
    action: replace
  - source_labels: [__metrics_path__]
    separator: ;
    regex: (.*)
    target_label: metrics_path
    replacement: $1
    action: replace
  - source_labels: [__address__]
    separator: ;
    regex: (.*)
    modulus: 1
    target_label: __tmp_hash
    replacement: $1
    action: hashmod
  - source_labels: [__tmp_hash]
    separator: ;
    regex: "0"
    replacement: $1
    action: keep
  kubernetes_sd_configs:
  - role: endpoints
    kubeconfig_file: ""
    follow_redirects: true
    enable_http2: true
    namespaces:
      own_namespace: false
      names:
      - kube-system
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-operator/0
  honor_labels: true
  honor_timestamps: true
  scrape_interval: 30s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: http
  follow_redirects: true
  enable_http2: true
  relabel_configs:
  - source_labels: [job]
    separator: ;
    regex: (.*)
    target_label: __tmp_prometheus_job_name
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_label_app, __meta_kubernetes_service_labelpresent_app]
    separator: ;
    regex: (kube-prometheus-stack-operator);true
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_service_label_release, __meta_kubernetes_service_labelpresent_release]
    separator: ;
    regex: (aml-monitoring);true
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: http
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Node;(.*)
    target_label: node
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Pod;(.*)
    target_label: pod
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: namespace
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: service
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_name]
    separator: ;
    regex: (.*)
    target_label: pod
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_container_name]
    separator: ;
    regex: (.*)
    target_label: container
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_phase]
    separator: ;
    regex: (Failed|Succeeded)
    replacement: $1
    action: drop
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: job
    replacement: ${1}
    action: replace
  - separator: ;
    regex: (.*)
    target_label: endpoint
    replacement: http
    action: replace
  - source_labels: [__address__]
    separator: ;
    regex: (.*)
    modulus: 1
    target_label: __tmp_hash
    replacement: $1
    action: hashmod
  - source_labels: [__tmp_hash]
    separator: ;
    regex: "0"
    replacement: $1
    action: keep
  kubernetes_sd_configs:
  - role: endpoints
    kubeconfig_file: ""
    follow_redirects: true
    enable_http2: true
    namespaces:
      own_namespace: false
      names:
      - aml-monitoring
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-prometheus/0
  honor_timestamps: true
  scrape_interval: 30s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: http
  follow_redirects: true
  enable_http2: true
  relabel_configs:
  - source_labels: [job]
    separator: ;
    regex: (.*)
    target_label: __tmp_prometheus_job_name
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_label_app, __meta_kubernetes_service_labelpresent_app]
    separator: ;
    regex: (kube-prometheus-stack-prometheus);true
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_service_label_release, __meta_kubernetes_service_labelpresent_release]
    separator: ;
    regex: (aml-monitoring);true
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_service_label_self_monitor, __meta_kubernetes_service_labelpresent_self_monitor]
    separator: ;
    regex: (true);true
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: http-web
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Node;(.*)
    target_label: node
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Pod;(.*)
    target_label: pod
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: namespace
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: service
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_name]
    separator: ;
    regex: (.*)
    target_label: pod
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_container_name]
    separator: ;
    regex: (.*)
    target_label: container
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_phase]
    separator: ;
    regex: (Failed|Succeeded)
    replacement: $1
    action: drop
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: job
    replacement: ${1}
    action: replace
  - separator: ;
    regex: (.*)
    target_label: endpoint
    replacement: http-web
    action: replace
  - source_labels: [__address__]
    separator: ;
    regex: (.*)
    modulus: 1
    target_label: __tmp_hash
    replacement: $1
    action: hashmod
  - source_labels: [__tmp_hash]
    separator: ;
    regex: "0"
    replacement: $1
    action: keep
  kubernetes_sd_configs:
  - role: endpoints
    kubeconfig_file: ""
    follow_redirects: true
    enable_http2: true
    namespaces:
      own_namespace: false
      names:
      - aml-monitoring
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-state-metrics/0
  honor_labels: true
  honor_timestamps: true
  scrape_interval: 30s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: http
  follow_redirects: true
  enable_http2: true
  relabel_configs:
  - source_labels: [job]
    separator: ;
    regex: (.*)
    target_label: __tmp_prometheus_job_name
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_label_app_kubernetes_io_instance, __meta_kubernetes_service_labelpresent_app_kubernetes_io_instance]
    separator: ;
    regex: (aml-monitoring);true
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_service_label_app_kubernetes_io_name, __meta_kubernetes_service_labelpresent_app_kubernetes_io_name]
    separator: ;
    regex: (kube-state-metrics);true
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: http
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Node;(.*)
    target_label: node
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Pod;(.*)
    target_label: pod
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: namespace
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: service
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_name]
    separator: ;
    regex: (.*)
    target_label: pod
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_container_name]
    separator: ;
    regex: (.*)
    target_label: container
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_phase]
    separator: ;
    regex: (Failed|Succeeded)
    replacement: $1
    action: drop
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: job
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_service_label_app_kubernetes_io_name]
    separator: ;
    regex: (.+)
    target_label: job
    replacement: ${1}
    action: replace
  - separator: ;
    regex: (.*)
    target_label: endpoint
    replacement: http
    action: replace
  - source_labels: [__address__]
    separator: ;
    regex: (.*)
    modulus: 1
    target_label: __tmp_hash
    replacement: $1
    action: hashmod
  - source_labels: [__tmp_hash]
    separator: ;
    regex: "0"
    replacement: $1
    action: keep
  kubernetes_sd_configs:
  - role: endpoints
    kubeconfig_file: ""
    follow_redirects: true
    enable_http2: true
    namespaces:
      own_namespace: false
      names:
      - aml-monitoring
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-loki/0
  honor_timestamps: true
  scrape_interval: 15s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: http
  follow_redirects: true
  enable_http2: true
  relabel_configs:
  - source_labels: [job]
    separator: ;
    regex: (.*)
    target_label: __tmp_prometheus_job_name
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_label_app_kubernetes_io_instance, __meta_kubernetes_service_labelpresent_app_kubernetes_io_instance]
    separator: ;
    regex: (aml-monitoring);true
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_service_label_app_kubernetes_io_name, __meta_kubernetes_service_labelpresent_app_kubernetes_io_name]
    separator: ;
    regex: (loki);true
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_service_label_prometheus_io_service_monitor, __meta_kubernetes_service_labelpresent_prometheus_io_service_monitor]
    separator: ;
    regex: (false);true
    replacement: $1
    action: drop
  - source_labels: [__meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: http-metrics
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Node;(.*)
    target_label: node
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Pod;(.*)
    target_label: pod
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: namespace
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: service
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_name]
    separator: ;
    regex: (.*)
    target_label: pod
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_container_name]
    separator: ;
    regex: (.*)
    target_label: container
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_phase]
    separator: ;
    regex: (Failed|Succeeded)
    replacement: $1
    action: drop
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: job
    replacement: ${1}
    action: replace
  - separator: ;
    regex: (.*)
    target_label: endpoint
    replacement: http-metrics
    action: replace
  - source_labels: [job]
    separator: ;
    regex: (.*)
    target_label: job
    replacement: aml-monitoring/$1
    action: replace
  - separator: ;
    regex: (.*)
    target_label: cluster
    replacement: aml-monitoring-loki
    action: replace
  - source_labels: [__address__]
    separator: ;
    regex: (.*)
    modulus: 1
    target_label: __tmp_hash
    replacement: $1
    action: hashmod
  - source_labels: [__tmp_hash]
    separator: ;
    regex: "0"
    replacement: $1
    action: keep
  kubernetes_sd_configs:
  - role: endpoints
    kubeconfig_file: ""
    follow_redirects: true
    enable_http2: true
    namespaces:
      own_namespace: false
      names:
      - aml-monitoring
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-prometheus-node-exporter/0
  honor_timestamps: true
  scrape_interval: 30s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: http
  follow_redirects: true
  enable_http2: true
  relabel_configs:
  - source_labels: [job]
    separator: ;
    regex: (.*)
    target_label: __tmp_prometheus_job_name
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_label_app_kubernetes_io_instance, __meta_kubernetes_service_labelpresent_app_kubernetes_io_instance]
    separator: ;
    regex: (aml-monitoring);true
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_service_label_app_kubernetes_io_name, __meta_kubernetes_service_labelpresent_app_kubernetes_io_name]
    separator: ;
    regex: (prometheus-node-exporter);true
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: http-metrics
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Node;(.*)
    target_label: node
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Pod;(.*)
    target_label: pod
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: namespace
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: service
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_name]
    separator: ;
    regex: (.*)
    target_label: pod
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_container_name]
    separator: ;
    regex: (.*)
    target_label: container
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_phase]
    separator: ;
    regex: (Failed|Succeeded)
    replacement: $1
    action: drop
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: job
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_service_label_jobLabel]
    separator: ;
    regex: (.+)
    target_label: job
    replacement: ${1}
    action: replace
  - separator: ;
    regex: (.*)
    target_label: endpoint
    replacement: http-metrics
    action: replace
  - source_labels: [__address__]
    separator: ;
    regex: (.*)
    modulus: 1
    target_label: __tmp_hash
    replacement: $1
    action: hashmod
  - source_labels: [__tmp_hash]
    separator: ;
    regex: "0"
    replacement: $1
    action: keep
  kubernetes_sd_configs:
  - role: endpoints
    kubeconfig_file: ""
    follow_redirects: true
    enable_http2: true
    namespaces:
      own_namespace: false
      names:
      - aml-monitoring
storage:
  tsdb:
    outofordertimewindow: 0
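
Every ServiceMonitor job shown in this part of the config restricts `kubernetes_sd_configs` to the `aml-monitoring` namespace. As a rough way to confirm that across a whole dump, here is a minimal sketch that lists the namespaces each job is limited to. It is not part of the original report: `SAMPLE_CONFIG` is a trimmed, hand-written excerpt in the same shape as the config above, `namespaces_per_job` is a hypothetical helper name, and the line-based parsing assumes the exact layout Prometheus prints (`- job_name:` at column 0, namespaces as a block list under `names:`).

```python
import re
from collections import defaultdict

# Trimmed excerpt in the same shape as the generated config above (assumption:
# the real dump would be saved to a file and read as text instead).
SAMPLE_CONFIG = """\
scrape_configs:
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-loki/0
  scrape_interval: 15s
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      own_namespace: false
      names:
      - aml-monitoring
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-prometheus-node-exporter/0
  scrape_interval: 30s
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      own_namespace: false
      names:
      - aml-monitoring
"""


def namespaces_per_job(config_text):
    """Map each job_name to the namespaces listed under its `names:` key."""
    result = defaultdict(list)
    job = None
    in_names = False
    for line in config_text.splitlines():
        job_match = re.match(r"^- job_name:\s*(\S+)", line)
        if job_match:
            job = job_match.group(1)
            result[job]  # register the job even if it lists no namespaces
            in_names = False
            continue
        if re.match(r"^\s+names:\s*$", line):
            in_names = True
            continue
        if in_names:
            item = re.match(r"^\s+-\s+(\S+)\s*$", line)
            if item and job is not None:
                result[job].append(item.group(1))
            else:
                # Any other line ends the `names:` list.
                in_names = False
    return dict(result)


for job_name, namespaces in namespaces_per_job(SAMPLE_CONFIG).items():
    print(job_name, namespaces)
```

If a namespace you expect to be scraped never appears in any job's list, no scrape job was generated for that namespace, which suggests checking what the operator selected rather than Prometheus itself.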

Prometheus logs:

ts=2023-07-04T08:35:02.327Z caller=main.go:575 level=info msg="Starting Prometheus Server" mode=server version="(version=2.44.0, branch=HEAD, revision=1ac5131f698ebc60f13fe2727f89b115a41f6558)"
ts=2023-07-04T08:35:02.327Z caller=main.go:580 level=info build_context="(go=go1.20.4, platform=linux/amd64, user=root@739e8181c5db, date=20230514-06:18:11, tags=netgo,builtinassets,stringlabels)"
ts=2023-07-04T08:35:02.327Z caller=main.go:581 level=info host_details="(Linux 5.10.165-143.735.amzn2.x86_64 #1 SMP Wed Jan 25 03:13:54 UTC 2023 x86_64 prometheus-aml-monitoring-kube-promet-0 (none))"
ts=2023-07-04T08:35:02.327Z caller=main.go:582 level=info fd_limits="(soft=1048576, hard=1048576)"
ts=2023-07-04T08:35:02.327Z caller=main.go:583 level=info vm_limits="(soft=unlimited, hard=unlimited)"
ts=2023-07-04T08:35:02.329Z caller=web.go:562 level=info component=web msg="Start listening for connections" address=0.0.0.0:9090
ts=2023-07-04T08:35:02.330Z caller=main.go:1016 level=info msg="Starting TSDB ..."
ts=2023-07-04T08:35:02.330Z caller=repair.go:56 level=info component=tsdb msg="Found healthy block" mint=1688427234695 maxt=1688428800000 ulid=01H4FBZNDN57KQG1HJZK21V8EE
ts=2023-07-04T08:35:02.331Z caller=tls_config.go:232 level=info component=web msg="Listening on" address=[::]:9090
ts=2023-07-04T08:35:02.331Z caller=tls_config.go:271 level=info component=web msg="TLS is disabled." http2=false address=[::]:9090
ts=2023-07-04T08:35:02.331Z caller=repair.go:56 level=info component=tsdb msg="Found healthy block" mint=1688428800005 maxt=1688436000000 ulid=01H4FDFDGFMMTYGW48SRP44TE6
ts=2023-07-04T08:35:02.331Z caller=repair.go:56 level=info component=tsdb msg="Found healthy block" mint=1688436000061 maxt=1688443200000 ulid=01H4FMB4RK744Q9DQF9E84S8DV
ts=2023-07-04T08:35:02.332Z caller=repair.go:56 level=info component=tsdb msg="Found healthy block" mint=1688443200036 maxt=1688450400000 ulid=01H4FV6W0E01XDREYJV6WBPBBN
ts=2023-07-04T08:35:02.372Z caller=head.go:588 level=info component=tsdb msg="Replaying on-disk memory mappable chunks if any"
ts=2023-07-04T08:35:02.682Z caller=head.go:669 level=info component=tsdb msg="On-disk memory mappable chunks replay completed" duration=309.880213ms
ts=2023-07-04T08:35:02.682Z caller=head.go:677 level=info component=tsdb msg="Replaying WAL, this may take a while"
ts=2023-07-04T08:35:02.982Z caller=head.go:713 level=info component=tsdb msg="WAL checkpoint loaded"
ts=2023-07-04T08:35:04.106Z caller=head.go:748 level=info component=tsdb msg="WAL segment loaded" segment=5 maxSegment=8
ts=2023-07-04T08:35:04.336Z caller=head.go:748 level=info component=tsdb msg="WAL segment loaded" segment=6 maxSegment=8
ts=2023-07-04T08:35:05.523Z caller=head.go:748 level=info component=tsdb msg="WAL segment loaded" segment=7 maxSegment=8
ts=2023-07-04T08:35:05.524Z caller=head.go:748 level=info component=tsdb msg="WAL segment loaded" segment=8 maxSegment=8
ts=2023-07-04T08:35:05.524Z caller=head.go:785 level=info component=tsdb msg="WAL replay completed" checkpoint_replay_duration=299.724971ms wal_replay_duration=2.541787971s wbl_replay_duration=250ns total_replay_duration=3.151454206s
ts=2023-07-04T08:35:05.597Z caller=main.go:1037 level=info fs_type=EXT4_SUPER_MAGIC
ts=2023-07-04T08:35:05.597Z caller=main.go:1040 level=info msg="TSDB started"
ts=2023-07-04T08:35:05.597Z caller=main.go:1041 level=debug msg="TSDB options" MinBlockDuration=2h MaxBlockDuration=1d9h36m MaxBytes=0B NoLockfile=false RetentionDuration=2w WALSegmentSize=0B WALCompression=true
ts=2023-07-04T08:35:05.597Z caller=main.go:1220 level=info msg="Loading configuration file" filename=/etc/prometheus/config_out/prometheus.env.yaml
ts=2023-07-04T08:35:05.600Z caller=kubernetes.go:327 level=info component="discovery manager scrape" discovery=kubernetes config=serviceMonitor/aml-monitoring/aml-monitoring-prometheus-node-exporter/0 msg="Using pod service account via in-cluster config"
ts=2023-07-04T08:35:05.600Z caller=kubernetes.go:327 level=info component="discovery manager scrape" discovery=kubernetes config=serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-apiserver/0 msg="Using pod service account via in-cluster config"
ts=2023-07-04T08:35:05.601Z caller=kubernetes.go:327 level=info component="discovery manager scrape" discovery=kubernetes config=serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-kube-etcd/0 msg="Using pod service account via in-cluster config"
ts=2023-07-04T08:35:05.601Z caller=manager.go:196 level=debug component="discovery manager scrape" msg="Starting provider" provider=kubernetes/0 subs="[serviceMonitor/aml-monitoring/aml-monitoring-prometheus-node-exporter/0 serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-alertmanager/0 serviceMonitor/aml-monitoring/aml-monitoring-kube-state-metrics/0 serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-operator/0 serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-prometheus/0 serviceMonitor/aml-monitoring/aml-monitoring-loki/0]"
ts=2023-07-04T08:35:05.601Z caller=manager.go:196 level=debug component="discovery manager scrape" msg="Starting provider" provider=kubernetes/1 subs=[serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-apiserver/0]
ts=2023-07-04T08:35:05.601Z caller=manager.go:196 level=debug component="discovery manager scrape" msg="Starting provider" provider=kubernetes/2 subs="[serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-kube-etcd/0 serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-kubelet/1 serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-kubelet/2 serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-coredns/0 serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-kubelet/0]"
ts=2023-07-04T08:35:05.601Z caller=kubernetes.go:327 level=info component="discovery manager notify" discovery=kubernetes config=config-0 msg="Using pod service account via in-cluster config"
ts=2023-07-04T08:35:05.601Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Starting reflector *v1.Pod (0s) from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.601Z caller=manager.go:196 level=debug component="discovery manager notify" msg="Starting provider" provider=kubernetes/0 subs=[config-0]
ts=2023-07-04T08:35:05.601Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Starting reflector *v1.Pod (0s) from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.601Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Starting reflector *v1.Service (0s) from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.601Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Starting reflector *v1.Service (0s) from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.601Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Listing and watching *v1.Pod from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.601Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Starting reflector *v1.Pod (0s) from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.601Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Listing and watching *v1.Service from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.601Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Listing and watching *v1.Pod from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.601Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Starting reflector *v1.Service (0s) from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.601Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Starting reflector *v1.Service (0s) from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.601Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Listing and watching *v1.Service from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.601Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Starting reflector *v1.Endpoints (0s) from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.601Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Listing and watching *v1.Service from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.601Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Listing and watching *v1.Service from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.601Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Listing and watching *v1.Endpoints from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.601Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Listing and watching *v1.Pod from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.601Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Starting reflector *v1.Endpoints (0s) from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.601Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Listing and watching *v1.Endpoints from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.601Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Starting reflector *v1.Endpoints (0s) from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.601Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Starting reflector *v1.Endpoints (0s) from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.601Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Listing and watching *v1.Endpoints from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.601Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Starting reflector *v1.Pod (0s) from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.601Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Listing and watching *v1.Pod from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.601Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Listing and watching *v1.Endpoints from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.615Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/pods?limit=500&resourceVersion=0 200 OK in 13 milliseconds"
ts=2023-07-04T08:35:05.615Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/kube-system/endpoints?limit=500&resourceVersion=0 200 OK in 13 milliseconds"
ts=2023-07-04T08:35:05.615Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/endpoints?limit=500&resourceVersion=0 200 OK in 14 milliseconds"
ts=2023-07-04T08:35:05.616Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/services?limit=500&resourceVersion=0 200 OK in 14 milliseconds"
ts=2023-07-04T08:35:05.618Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/kube-system/pods?limit=500&resourceVersion=0 200 OK in 16 milliseconds"
ts=2023-07-04T08:35:05.618Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/default/pods?limit=500&resourceVersion=0 200 OK in 16 milliseconds"
ts=2023-07-04T08:35:05.619Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/default/services?limit=500&resourceVersion=0 200 OK in 17 milliseconds"
ts=2023-07-04T08:35:05.619Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/services?limit=500&resourceVersion=0 200 OK in 17 milliseconds"
ts=2023-07-04T08:35:05.621Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/kube-system/services?limit=500&resourceVersion=0 200 OK in 19 milliseconds"
ts=2023-07-04T08:35:05.621Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/pods?limit=500&resourceVersion=0 200 OK in 19 milliseconds"
ts=2023-07-04T08:35:05.622Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/default/endpoints?limit=500&resourceVersion=0 200 OK in 20 milliseconds"
ts=2023-07-04T08:35:05.622Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/endpoints?limit=500&resourceVersion=0 200 OK in 20 milliseconds"
ts=2023-07-04T08:35:05.625Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/default/services?allowWatchBookmarks=true&resourceVersion=106603667&timeout=9m14s&timeoutSeconds=554&watch=true 200 OK in 3 milliseconds"
ts=2023-07-04T08:35:05.625Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/default/pods?allowWatchBookmarks=true&resourceVersion=106603657&timeout=9m18s&timeoutSeconds=558&watch=true 200 OK in 3 milliseconds"
ts=2023-07-04T08:35:05.625Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/default/endpoints?allowWatchBookmarks=true&resourceVersion=106603650&timeout=5m29s&timeoutSeconds=329&watch=true 200 OK in 3 milliseconds"
ts=2023-07-04T08:35:05.626Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/kube-system/endpoints?allowWatchBookmarks=true&resourceVersion=106603650&timeout=9m39s&timeoutSeconds=579&watch=true 200 OK in 3 milliseconds"
ts=2023-07-04T08:35:05.626Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/services?allowWatchBookmarks=true&resourceVersion=106603667&timeout=5m20s&timeoutSeconds=320&watch=true 200 OK in 3 milliseconds"
ts=2023-07-04T08:35:05.626Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/kube-system/services?allowWatchBookmarks=true&resourceVersion=106603667&timeout=6m40s&timeoutSeconds=400&watch=true 200 OK in 3 milliseconds"
ts=2023-07-04T08:35:05.626Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/services?allowWatchBookmarks=true&resourceVersion=106603667&timeout=7m53s&timeoutSeconds=473&watch=true 200 OK in 4 milliseconds"
ts=2023-07-04T08:35:05.627Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/endpoints?allowWatchBookmarks=true&resourceVersion=106603650&timeout=5m26s&timeoutSeconds=326&watch=true 200 OK in 3 milliseconds"
ts=2023-07-04T08:35:05.627Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/kube-system/pods?allowWatchBookmarks=true&resourceVersion=106603657&timeout=7m58s&timeoutSeconds=478&watch=true 200 OK in 1 milliseconds"
ts=2023-07-04T08:35:05.627Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/endpoints?allowWatchBookmarks=true&resourceVersion=106603650&timeout=8m37s&timeoutSeconds=517&watch=true 200 OK in 4 milliseconds"
ts=2023-07-04T08:35:05.627Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/pods?allowWatchBookmarks=true&resourceVersion=106603657&timeout=6m50s&timeoutSeconds=410&watch=true 200 OK in 2 milliseconds"
ts=2023-07-04T08:35:05.627Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/pods?allowWatchBookmarks=true&resourceVersion=106603657&timeout=5m39s&timeoutSeconds=339&watch=true 200 OK in 2 milliseconds"
ts=2023-07-04T08:35:05.633Z caller=main.go:1257 level=info msg="Completed loading of configuration file" filename=/etc/prometheus/config_out/prometheus.env.yaml totalDuration=35.77154ms db_storage=1.161µs remote_storage=1.46µs web_handler=580ns query_engine=1.191µs scrape=187.746µs scrape_sd=669.838µs notify=20.95µs notify_sd=235.996µs rules=31.6451ms tracing=3.87µs
ts=2023-07-04T08:35:05.633Z caller=main.go:1001 level=info msg="Server is ready to receive web requests."
ts=2023-07-04T08:35:05.633Z caller=main.go:1220 level=info msg="Loading configuration file" filename=/etc/prometheus/config_out/prometheus.env.yaml
ts=2023-07-04T08:35:05.633Z caller=manager.go:995 level=info component="rule manager" msg="Starting rule manager..."
ts=2023-07-04T08:35:05.636Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Stopping reflector *v1.Pod (0s) from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.636Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Stopping reflector *v1.Endpoints (0s) from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.636Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="stop requested"
ts=2023-07-04T08:35:05.636Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Stopping reflector *v1.Pod (0s) from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.636Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Stopping reflector *v1.Endpoints (0s) from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.636Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Stopping reflector *v1.Service (0s) from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.636Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Stopping reflector *v1.Service (0s) from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.636Z caller=kubernetes.go:327 level=info component="discovery manager scrape" discovery=kubernetes config=serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-apiserver/0 msg="Using pod service account via in-cluster config"
ts=2023-07-04T08:35:05.636Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Stopping reflector *v1.Pod (0s) from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.636Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Stopping reflector *v1.Endpoints (0s) from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.636Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="stop requested"
ts=2023-07-04T08:35:05.636Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="stop requested"
ts=2023-07-04T08:35:05.636Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Stopping reflector *v1.Service (0s) from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.636Z caller=kubernetes.go:327 level=info component="discovery manager scrape" discovery=kubernetes config=serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-coredns/0 msg="Using pod service account via in-cluster config"
ts=2023-07-04T08:35:05.636Z caller=kubernetes.go:327 level=info component="discovery manager scrape" discovery=kubernetes config=serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-operator/0 msg="Using pod service account via in-cluster config"
ts=2023-07-04T08:35:05.636Z caller=manager.go:196 level=debug component="discovery manager scrape" msg="Starting provider" provider=kubernetes/0 subs=[serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-apiserver/0]
ts=2023-07-04T08:35:05.636Z caller=manager.go:196 level=debug component="discovery manager scrape" msg="Starting provider" provider=kubernetes/1 subs="[serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-coredns/0 serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-kubelet/1 serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-kube-etcd/0 serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-kubelet/0 serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-kubelet/2]"
ts=2023-07-04T08:35:05.636Z caller=manager.go:196 level=debug component="discovery manager scrape" msg="Starting provider" provider=kubernetes/2 subs="[serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-operator/0 serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-prometheus/0 serviceMonitor/aml-monitoring/aml-monitoring-prometheus-node-exporter/0 serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-alertmanager/0 serviceMonitor/aml-monitoring/aml-monitoring-kube-state-metrics/0 serviceMonitor/aml-monitoring/aml-monitoring-loki/0]"
ts=2023-07-04T08:35:05.636Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Starting reflector *v1.Service (0s) from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.636Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Starting reflector *v1.Service (0s) from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.636Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Starting reflector *v1.Pod (0s) from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.636Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Starting reflector *v1.Pod (0s) from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.636Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Stopping reflector *v1.Endpoints (0s) from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.636Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Listing and watching *v1.Pod from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.636Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Starting reflector *v1.Service (0s) from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.636Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Starting reflector *v1.Pod (0s) from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.636Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Listing and watching *v1.Pod from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.636Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Starting reflector *v1.Endpoints (0s) from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.636Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Listing and watching *v1.Pod from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.636Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Listing and watching *v1.Endpoints from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.636Z caller=kubernetes.go:327 level=info component="discovery manager notify" discovery=kubernetes config=config-0 msg="Using pod service account via in-cluster config"
ts=2023-07-04T08:35:05.636Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Stopping reflector *v1.Pod (0s) from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.636Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Listing and watching *v1.Service from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.636Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Listing and watching *v1.Service from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.636Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Starting reflector *v1.Endpoints (0s) from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.637Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Listing and watching *v1.Endpoints from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.636Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Starting reflector *v1.Endpoints (0s) from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.637Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Listing and watching *v1.Endpoints from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.637Z caller=manager.go:196 level=debug component="discovery manager notify" msg="Starting provider" provider=kubernetes/0 subs=[config-0]
ts=2023-07-04T08:35:05.636Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Stopping reflector *v1.Service (0s) from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.636Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="stop requested"
ts=2023-07-04T08:35:05.636Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Listing and watching *v1.Service from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.637Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Starting reflector *v1.Service (0s) from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.637Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Listing and watching *v1.Service from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.637Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Starting reflector *v1.Endpoints (0s) from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.637Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Starting reflector *v1.Pod (0s) from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.637Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Listing and watching *v1.Endpoints from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.637Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="Listing and watching *v1.Pod from pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169"
ts=2023-07-04T08:35:05.638Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/default/pods?limit=500&resourceVersion=0 200 OK in 1 milliseconds"
ts=2023-07-04T08:35:05.638Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/default/endpoints?limit=500&resourceVersion=0 200 OK in 1 milliseconds"
ts=2023-07-04T08:35:05.639Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/endpoints?limit=500&resourceVersion=0 200 OK in 1 milliseconds"
ts=2023-07-04T08:35:05.639Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/services?limit=500&resourceVersion=0 200 OK in 2 milliseconds"
ts=2023-07-04T08:35:05.641Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/pods?limit=500&resourceVersion=0 200 OK in 4 milliseconds"
ts=2023-07-04T08:35:05.641Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/endpoints?limit=500&resourceVersion=0 200 OK in 3 milliseconds"
ts=2023-07-04T08:35:05.641Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/default/services?limit=500&resourceVersion=0 200 OK in 4 milliseconds"
ts=2023-07-04T08:35:05.641Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/kube-system/services?limit=500&resourceVersion=0 200 OK in 4 milliseconds"
ts=2023-07-04T08:35:05.641Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/kube-system/endpoints?limit=500&resourceVersion=0 200 OK in 4 milliseconds"
ts=2023-07-04T08:35:05.641Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/services?limit=500&resourceVersion=0 200 OK in 3 milliseconds"
ts=2023-07-04T08:35:05.642Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/kube-system/pods?limit=500&resourceVersion=0 200 OK in 5 milliseconds"
ts=2023-07-04T08:35:05.645Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/default/services?allowWatchBookmarks=true&resourceVersion=106603667&timeout=5m19s&timeoutSeconds=319&watch=true 200 OK in 3 milliseconds"
ts=2023-07-04T08:35:05.645Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/default/pods?allowWatchBookmarks=true&resourceVersion=106603657&timeout=5m33s&timeoutSeconds=333&watch=true 200 OK in 4 milliseconds"
ts=2023-07-04T08:35:05.645Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/pods?limit=500&resourceVersion=0 200 OK in 8 milliseconds"
ts=2023-07-04T08:35:05.645Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/kube-system/services?allowWatchBookmarks=true&resourceVersion=106603667&timeout=7m11s&timeoutSeconds=431&watch=true 200 OK in 3 milliseconds"
ts=2023-07-04T08:35:05.645Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/kube-system/endpoints?allowWatchBookmarks=true&resourceVersion=106603650&timeout=6m51s&timeoutSeconds=411&watch=true 200 OK in 3 milliseconds"
ts=2023-07-04T08:35:05.645Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/services?allowWatchBookmarks=true&resourceVersion=106603667&timeout=9m12s&timeoutSeconds=552&watch=true 200 OK in 3 milliseconds"
ts=2023-07-04T08:35:05.645Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/endpoints?allowWatchBookmarks=true&resourceVersion=106603650&timeout=8m16s&timeoutSeconds=496&watch=true 200 OK in 4 milliseconds"
ts=2023-07-04T08:35:05.647Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/default/endpoints?allowWatchBookmarks=true&resourceVersion=106603650&timeout=5m33s&timeoutSeconds=333&watch=true 200 OK in 2 milliseconds"
ts=2023-07-04T08:35:05.647Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/pods?allowWatchBookmarks=true&resourceVersion=106603657&timeout=8m5s&timeoutSeconds=485&watch=true 200 OK in 3 milliseconds"
ts=2023-07-04T08:35:05.647Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/services?allowWatchBookmarks=true&resourceVersion=106603667&timeout=8m44s&timeoutSeconds=524&watch=true 200 OK in 1 milliseconds"
ts=2023-07-04T08:35:05.647Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/endpoints?allowWatchBookmarks=true&resourceVersion=106603650&timeout=5m41s&timeoutSeconds=341&watch=true 200 OK in 1 milliseconds"
ts=2023-07-04T08:35:05.649Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/pods?allowWatchBookmarks=true&resourceVersion=106603657&timeout=5m18s&timeoutSeconds=318&watch=true 200 OK in 0 milliseconds"
ts=2023-07-04T08:35:05.650Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/kube-system/pods?allowWatchBookmarks=true&resourceVersion=106603657&timeout=9m37s&timeoutSeconds=577&watch=true 200 OK in 0 milliseconds"
ts=2023-07-04T08:35:05.689Z caller=main.go:1257 level=info msg="Completed loading of configuration file" filename=/etc/prometheus/config_out/prometheus.env.yaml totalDuration=56.346054ms db_storage=970ns remote_storage=1.03µs web_handler=420ns query_engine=630ns scrape=45.721µs scrape_sd=739.92µs notify=16.15µs notify_sd=363.8µs rules=52.377037ms tracing=6.21µs
ts=2023-07-04T08:35:05.737Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="caches populated"
ts=2023-07-04T08:35:05.737Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="caches populated"
ts=2023-07-04T08:35:05.737Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="caches populated"
ts=2023-07-04T08:35:05.737Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="caches populated"

What's your helm version?

version.BuildInfo{Version:"v3.12.1", GitCommit:"f32a527a060157990e2aa86bf45010dfb3cc8b8d", GitTreeState:"clean", GoVersion:"go1.20.4"}

What's your kubectl version?

Client Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.2", GitCommit:"7f6f68fdabc4df88cfea2dcf9a19b2b830f1e647", GitTreeState:"clean", BuildDate:"2023-05-17T14:20:07Z", GoVersion:"go1.20.4", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v5.0.1

Which chart?

kube-prometheus-stack

What's the chart version?

47.0.0

What happened?

Prometheus is ignoring the configuration that should make it discover resources in all namespaces of the EKS cluster.

What you expected to happen?

I'm expecting Prometheus to recognise these config options:

prometheus:
    prometheusSpec:
      podMonitorSelectorNilUsesHelmValues: false
      serviceMonitorSelectorNilUsesHelmValues: false

How to reproduce it?

No response

Enter the changed values of values.yaml?

No response

Enter the command that you execute and that is failing/misfunctioning.

helm upgrade --install -n aml-monitoring aml-monitoring ./ -f values.yaml

Anything else we need to know?

No response

johnswarbrick-napier commented 1 year ago

I tried explicitly setting the namespaces to discover using:

    namespaces:
      releaseNamespace: true
      additional:
      - aml
      - ingress-nginx
      - kube-system

This has been correctly set in the operator deployment YAML:

    spec:
      containers:
      - args:
        - --kubelet-service=kube-system/aml-monitoring-kube-promet-kubelet
        - --log-level=debug
        - --namespaces=aml-monitoring,aml,ingress-nginx,kube-system
        - --localhost=127.0.0.1
        - --prometheus-config-reloader=napier.azurecr.io/prometheus-config-reloader:v0.66.0
        - --config-reloader-cpu-request=200m
        - --config-reloader-cpu-limit=200m
        - --config-reloader-memory-request=50Mi
        - --config-reloader-memory-limit=50Mi
        - --thanos-default-base-image=napier.azurecr.io/thanos/thanos:v0.31.0
        - --secret-field-selector=type!=kubernetes.io/dockercfg,type!=kubernetes.io/service-account-token,type!=helm.sh/release.v1

...but Prometheus is still only querying the Kubernetes API for default, kube-system and the installation namespace - it is ignoring the explicitly set namespaces of aml and ingress-nginx:

ts=2023-07-04T09:00:15.657Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169: Watch close - *v1.Endpoints total 8 items received"
ts=2023-07-04T09:00:15.659Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/kube-system/endpoints?allowWatchBookmarks=true&resourceVersion=106613346&timeout=6m50s&timeoutSeconds=410&watch=true 200 OK in 1 milliseconds"
ts=2023-07-04T09:01:04.654Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169: Watch close - *v1.Endpoints total 12 items received"
ts=2023-07-04T09:01:04.657Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/endpoints?allowWatchBookmarks=true&resourceVersion=106613675&timeout=9m2s&timeoutSeconds=542&watch=true 200 OK in 2 milliseconds"
ts=2023-07-04T09:01:55.663Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169: Watch close - *v1.Pod total 8 items received"
ts=2023-07-04T09:01:55.668Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/pods?allowWatchBookmarks=true&resourceVersion=106613994&timeout=8m47s&timeoutSeconds=527&watch=true 200 OK in 4 milliseconds"
ts=2023-07-04T09:02:13.660Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169: Watch close - *v1.Endpoints total 8 items received"
ts=2023-07-04T09:02:13.662Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/endpoints?allowWatchBookmarks=true&resourceVersion=106614105&timeout=8m55s&timeoutSeconds=535&watch=true 200 OK in 2 milliseconds"
ts=2023-07-04T09:04:26.662Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169: Watch close - *v1.Pod total 7 items received"
ts=2023-07-04T09:04:26.665Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/default/pods?allowWatchBookmarks=true&resourceVersion=106614950&timeout=6m10s&timeoutSeconds=370&watch=true 200 OK in 3 milliseconds"
ts=2023-07-04T09:04:33.657Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169: Watch close - *v1.Service total 10 items received"
ts=2023-07-04T09:04:33.660Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/default/services?allowWatchBookmarks=true&resourceVersion=106614994&timeout=7m1s&timeoutSeconds=421&watch=true 200 OK in 3 milliseconds"
ts=2023-07-04T09:04:50.663Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169: Watch close - *v1.Service total 7 items received"
ts=2023-07-04T09:04:50.666Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/services?allowWatchBookmarks=true&resourceVersion=106615093&timeout=9m40s&timeoutSeconds=580&watch=true 200 OK in 1 milliseconds"
ts=2023-07-04T09:06:13.663Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169: Watch close - *v1.Pod total 7 items received"
ts=2023-07-04T09:06:13.667Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/kube-system/pods?allowWatchBookmarks=true&resourceVersion=106615658&timeout=8m45s&timeoutSeconds=525&watch=true 200 OK in 4 milliseconds"
ts=2023-07-04T09:06:15.663Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169: Watch close - *v1.Endpoints total 8 items received"
ts=2023-07-04T09:06:15.665Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/default/endpoints?allowWatchBookmarks=true&resourceVersion=106615666&timeout=9m27s&timeoutSeconds=567&watch=true 200 OK in 1 milliseconds"
ts=2023-07-04T09:07:05.660Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169: Watch close - *v1.Endpoints total 8 items received"
ts=2023-07-04T09:07:05.663Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/kube-system/endpoints?allowWatchBookmarks=true&resourceVersion=106615983&timeout=5m53s&timeoutSeconds=353&watch=true 200 OK in 3 milliseconds"
ts=2023-07-04T09:07:33.661Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169: Watch close - *v1.Pod total 10 items received"
ts=2023-07-04T09:07:33.666Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/pods?allowWatchBookmarks=true&resourceVersion=106616160&timeout=7m8s&timeoutSeconds=428&watch=true 200 OK in 4 milliseconds"
ts=2023-07-04T09:07:45.659Z caller=klog.go:55 level=debug component=k8s_client_runtime func=Verbose.Infof msg="pkg/mod/k8s.io/client-go@v0.26.2/tools/cache/reflector.go:169: Watch close - *v1.Service total 9 items received"
ts=2023-07-04T09:07:45.660Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/kube-system/services?allowWatchBookmarks=true&resourceVersion=106616253&timeout=8m29s&timeoutSeconds=509&watch=true 200 OK in 1 milliseconds"
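
The comparison described above (namespaces passed to the operator versus namespaces Prometheus actually queries) can be automated. The following is only a minimal sketch: the function names are hypothetical, the sample log lines are shortened copies of the excerpt above, and it assumes the API paths in the log always have the form /api/v1/namespaces/<name>/<resource>.

```python
import re

# Namespace segment of the Kubernetes API paths that appear in the Prometheus debug log.
# Assumption: only pods, services and endpoints are watched, as in the excerpt above.
API_PATH_NS = re.compile(r"/api/v1/namespaces/([^/]+)/(?:pods|services|endpoints)")


def configured_namespaces(operator_args):
    """Return the namespaces listed in the operator's --namespaces flag."""
    for arg in operator_args:
        if arg.startswith("--namespaces="):
            return set(arg.split("=", 1)[1].split(","))
    return set()


def watched_namespaces(log_lines):
    """Return every namespace that shows up in a Kubernetes API request path."""
    found = set()
    for line in log_lines:
        match = API_PATH_NS.search(line)
        if match:
            found.add(match.group(1))
    return found


def missing_namespaces(operator_args, log_lines):
    """Namespaces given to the operator that never appear in the Prometheus log."""
    return sorted(configured_namespaces(operator_args) - watched_namespaces(log_lines))


# Sample data taken from the deployment args and the log excerpt (lines shortened).
operator_args = [
    "--kubelet-service=kube-system/aml-monitoring-kube-promet-kubelet",
    "--namespaces=aml-monitoring,aml,ingress-nginx,kube-system",
]
log_lines = [
    'msg="GET https://10.101.0.1:443/api/v1/namespaces/kube-system/endpoints?watch=true 200 OK"',
    'msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/pods?watch=true 200 OK"',
    'msg="GET https://10.101.0.1:443/api/v1/namespaces/default/services?watch=true 200 OK"',
]

print(missing_namespaces(operator_args, log_lines))  # ['aml', 'ingress-nginx']
```

With the full Prometheus log fed in line by line, an empty result would mean every namespace given to the operator is being queried.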

johnswarbrick-napier commented 1 year ago

I made a small amount of progress - the Prometheus Operator is aware of other namespaces now - although Prometheus still isn't scraping from namespaces other than default, kube-system and [install_namespace].

The same chart versions work perfectly on Azure, with discovery happening across all namespaces, so this may be an AWS EKS-specific issue. I've been working on it for two days solid and still can't get it working!

Prometheus Operator logs showing all namespaces selected:

level=debug ts=2023-07-04T18:35:57.037842081Z caller=resource_selector.go:93 component=prometheusoperator msg="filtering namespaces to select ServiceMonitors from" namespaces=default,aml-monitoring,kube-public,tailscale,kube-node-lease,kube-system,aml,ingress-nginx,test-aml namespace=aml-monitoring prometheus=aml-monitoring-kube-promet
level=debug ts=2023-07-04T18:35:57.042876561Z caller=resource_selector.go:334 component=prometheusoperator msg="filtering namespaces to select PodMonitors from" namespaces=kube-node-lease,kube-system,default,aml-monitoring,kube-public,tailscale,ingress-nginx,test-aml,aml namespace=aml-monitoring prometheus=aml-monitoring-kube-promet
level=debug ts=2023-07-04T18:35:57.042927182Z caller=resource_selector.go:464 component=prometheusoperator msg="filtering namespaces to select Probes from" namespaces=kube-node-lease,kube-system,default,aml-monitoring,kube-public,tailscale,ingress-nginx,test-aml,aml namespace=aml-monitoring prometheus=aml-monitoring-kube-promet
level=debug ts=2023-07-04T18:35:57.042972863Z caller=resource_selector.go:611 component=prometheusoperator msg="filtering namespaces to select ScrapeConfigs from" namespaces=kube-node-lease,kube-system,default,aml-monitoring,kube-public,tailscale,ingress-nginx,test-aml,aml namespace=aml-monitoring prometheus=aml-monitoring-kube-promet

Prometheus logs showing that it's only making Kubernetes API queries for default, kube-system and [install_namespace] namespaces:

ts=2023-07-04T18:41:24.846Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/kube-system/pods?allowWatchBookmarks=true&resourceVersion=106838227&timeout=7m48s&timeoutSeconds=468&watch=true 200 OK in 5 milliseconds"
ts=2023-07-04T18:41:58.842Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/pods?allowWatchBookmarks=true&resourceVersion=106838480&timeout=5m50s&timeoutSeconds=350&watch=true 200 OK in 2 milliseconds"
ts=2023-07-04T18:42:12.841Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/default/endpoints?allowWatchBookmarks=true&resourceVersion=106838559&timeout=9m51s&timeoutSeconds=591&watch=true 200 OK in 3 milliseconds"
ts=2023-07-04T18:42:59.838Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/kube-system/endpoints?allowWatchBookmarks=true&resourceVersion=106838893&timeout=9m5s&timeoutSeconds=545&watch=true 200 OK in 4 milliseconds"
ts=2023-07-04T18:43:52.843Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/services?allowWatchBookmarks=true&resourceVersion=106839244&timeout=6m25s&timeoutSeconds=385&watch=true 200 OK in 1 milliseconds"
ts=2023-07-04T18:44:10.847Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/endpoints?allowWatchBookmarks=true&resourceVersion=106839367&timeout=8m3s&timeoutSeconds=483&watch=true 200 OK in 5 milliseconds"
ts=2023-07-04T18:45:29.839Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/services?allowWatchBookmarks=true&resourceVersion=106839851&timeout=7m39s&timeoutSeconds=459&watch=true 200 OK in 3 milliseconds"
ts=2023-07-04T18:46:19.839Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/default/pods?allowWatchBookmarks=true&resourceVersion=106840192&timeout=7m11s&timeoutSeconds=431&watch=true 200 OK in 1 milliseconds"
ts=2023-07-04T18:46:47.844Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/default/services?allowWatchBookmarks=true&resourceVersion=106840345&timeout=7m0s&timeoutSeconds=420&watch=true 200 OK in 1 milliseconds"
ts=2023-07-04T18:47:48.845Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/pods?allowWatchBookmarks=true&resourceVersion=106840758&timeout=9m33s&timeoutSeconds=573&watch=true 200 OK in 2 milliseconds"
ts=2023-07-04T18:47:51.852Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/endpoints?allowWatchBookmarks=true&resourceVersion=106840782&timeout=6m25s&timeoutSeconds=385&watch=true 200 OK in 2 milliseconds"
ts=2023-07-04T18:48:05.845Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/kube-system/services?allowWatchBookmarks=true&resourceVersion=106840847&timeout=8m57s&timeoutSeconds=537&watch=true 200 OK in 3 milliseconds"
ts=2023-07-04T18:48:15.840Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/pods?allowWatchBookmarks=true&resourceVersion=106840921&timeout=7m18s&timeoutSeconds=438&watch=true 200 OK in 1 milliseconds"
ts=2023-07-04T18:49:12.849Z caller=klog.go:84 level=debug component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/kube-system/pods?allowWatchBookmarks=true&resourceVersion=106841291&timeout=9m41s&timeoutSeconds=581&watch=true 200 OK in 2 milliseconds"
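
Since Prometheus only watches the namespaces named under kubernetes_sd_configs in its generated configuration, it can help to list those names per scrape job. The sketch below is an assumption-laden illustration rather than an official tool: it takes the base64 value of the prometheus.yaml.gz key as the Kubernetes API returns it for the secret, decompresses it, and scans the text line by line. The function names are hypothetical, and the scanner only understands the layout the operator emits (a "names:" key followed by "- <namespace>" items), so it is not a general YAML parser.

```python
import base64
import gzip


def decode_prometheus_config(b64_value):
    """Decode a base64-encoded, gzip-compressed prometheus.yaml.gz secret value."""
    return gzip.decompress(base64.b64decode(b64_value)).decode("utf-8")


def sd_namespaces(config_text):
    """Map each job_name to the namespaces listed under its 'names:' keys.

    Line-based scan that assumes the operator-generated layout; not a YAML parser.
    """
    result = {}
    job = None
    in_names = False
    for raw in config_text.splitlines():
        line = raw.strip()
        if line.startswith("- job_name:"):
            job = line.split(":", 1)[1].strip()
            result[job] = []
            in_names = False
        elif line == "names:":
            in_names = True
        elif in_names and line.startswith("- "):
            if job is not None:
                result[job].append(line[2:].strip())
        else:
            in_names = False
    return result


# Round-trip demo with a small fragment shaped like the generated config.
sample = """scrape_configs:
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-alertmanager/0
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - aml-monitoring
  metrics_path: /metrics
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-apiserver/0
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - default
  scheme: https
"""
encoded = base64.b64encode(gzip.compress(sample.encode("utf-8"))).decode("ascii")

for job_name, namespaces in sd_namespaces(decode_prometheus_config(encoded)).items():
    print(job_name, namespaces)
```

If no job lists a namespace such as aml or ingress-nginx, Prometheus has no reason to query it, whatever namespaces the operator itself is allowed to watch.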

These are the contents of prometheus.yaml.gz from the secret prometheus-aml-monitoring-kube-promet:

global:
  evaluation_interval: 30s
  scrape_interval: 30s
  external_labels:
    prometheus: aml-monitoring/aml-monitoring-kube-promet
    prometheus_replica: $(POD_NAME)
rule_files:
- /etc/prometheus/rules/prometheus-aml-monitoring-kube-promet-rulefiles-0/*.yaml
scrape_configs:
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-alertmanager/0
  honor_labels: false
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - aml-monitoring
  metrics_path: /metrics
  enable_http2: true
  relabel_configs:
  - source_labels:
    - job
    target_label: __tmp_prometheus_job_name
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_app
    - __meta_kubernetes_service_labelpresent_app
    regex: (kube-prometheus-stack-alertmanager);true
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_release
    - __meta_kubernetes_service_labelpresent_release
    regex: (aml-monitoring);true
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_self_monitor
    - __meta_kubernetes_service_labelpresent_self_monitor
    regex: (true);true
  - action: keep
    source_labels:
    - __meta_kubernetes_endpoint_port_name
    regex: http-web
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Node;(.*)
    replacement: ${1}
    target_label: node
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Pod;(.*)
    replacement: ${1}
    target_label: pod
  - source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: service
  - source_labels:
    - __meta_kubernetes_pod_name
    target_label: pod
  - source_labels:
    - __meta_kubernetes_pod_container_name
    target_label: container
  - action: drop
    source_labels:
    - __meta_kubernetes_pod_phase
    regex: (Failed|Succeeded)
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: job
    replacement: ${1}
  - target_label: endpoint
    replacement: http-web
  - source_labels:
    - __address__
    target_label: __tmp_hash
    modulus: 1
    action: hashmod
  - source_labels:
    - __tmp_hash
    regex: $(SHARD)
    action: keep
  metric_relabel_configs: []
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-apiserver/0
  honor_labels: false
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - default
  scheme: https
  tls_config:
    insecure_skip_verify: false
    server_name: kubernetes
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  relabel_configs:
  - source_labels:
    - job
    target_label: __tmp_prometheus_job_name
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_component
    - __meta_kubernetes_service_labelpresent_component
    regex: (apiserver);true
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_provider
    - __meta_kubernetes_service_labelpresent_provider
    regex: (kubernetes);true
  - action: keep
    source_labels:
    - __meta_kubernetes_endpoint_port_name
    regex: https
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Node;(.*)
    replacement: ${1}
    target_label: node
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Pod;(.*)
    replacement: ${1}
    target_label: pod
  - source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: service
  - source_labels:
    - __meta_kubernetes_pod_name
    target_label: pod
  - source_labels:
    - __meta_kubernetes_pod_container_name
    target_label: container
  - action: drop
    source_labels:
    - __meta_kubernetes_pod_phase
    regex: (Failed|Succeeded)
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: job
    replacement: ${1}
  - source_labels:
    - __meta_kubernetes_service_label_component
    target_label: job
    regex: (.+)
    replacement: ${1}
  - target_label: endpoint
    replacement: https
  - source_labels:
    - __address__
    target_label: __tmp_hash
    modulus: 1
    action: hashmod
  - source_labels:
    - __tmp_hash
    regex: $(SHARD)
    action: keep
  metric_relabel_configs:
  - source_labels:
    - __name__
    - le
    regex: apiserver_request_duration_seconds_bucket;(0.15|0.2|0.3|0.35|0.4|0.45|0.6|0.7|0.8|0.9|1.25|1.5|1.75|2|3|3.5|4|4.5|6|7|8|9|15|25|40|50)
    action: drop
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-coredns/0
  honor_labels: false
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - kube-system
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  relabel_configs:
  - source_labels:
    - job
    target_label: __tmp_prometheus_job_name
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_app
    - __meta_kubernetes_service_labelpresent_app
    regex: (kube-prometheus-stack-coredns);true
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_release
    - __meta_kubernetes_service_labelpresent_release
    regex: (aml-monitoring);true
  - action: keep
    source_labels:
    - __meta_kubernetes_endpoint_port_name
    regex: http-metrics
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Node;(.*)
    replacement: ${1}
    target_label: node
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Pod;(.*)
    replacement: ${1}
    target_label: pod
  - source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: service
  - source_labels:
    - __meta_kubernetes_pod_name
    target_label: pod
  - source_labels:
    - __meta_kubernetes_pod_container_name
    target_label: container
  - action: drop
    source_labels:
    - __meta_kubernetes_pod_phase
    regex: (Failed|Succeeded)
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: job
    replacement: ${1}
  - source_labels:
    - __meta_kubernetes_service_label_jobLabel
    target_label: job
    regex: (.+)
    replacement: ${1}
  - target_label: endpoint
    replacement: http-metrics
  - source_labels:
    - __address__
    target_label: __tmp_hash
    modulus: 1
    action: hashmod
  - source_labels:
    - __tmp_hash
    regex: $(SHARD)
    action: keep
  metric_relabel_configs: []
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-kube-etcd/0
  honor_labels: false
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - kube-system
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  relabel_configs:
  - source_labels:
    - job
    target_label: __tmp_prometheus_job_name
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_app
    - __meta_kubernetes_service_labelpresent_app
    regex: (kube-prometheus-stack-kube-etcd);true
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_release
    - __meta_kubernetes_service_labelpresent_release
    regex: (aml-monitoring);true
  - action: keep
    source_labels:
    - __meta_kubernetes_endpoint_port_name
    regex: http-metrics
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Node;(.*)
    replacement: ${1}
    target_label: node
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Pod;(.*)
    replacement: ${1}
    target_label: pod
  - source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: service
  - source_labels:
    - __meta_kubernetes_pod_name
    target_label: pod
  - source_labels:
    - __meta_kubernetes_pod_container_name
    target_label: container
  - action: drop
    source_labels:
    - __meta_kubernetes_pod_phase
    regex: (Failed|Succeeded)
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: job
    replacement: ${1}
  - source_labels:
    - __meta_kubernetes_service_label_jobLabel
    target_label: job
    regex: (.+)
    replacement: ${1}
  - target_label: endpoint
    replacement: http-metrics
  - source_labels:
    - __address__
    target_label: __tmp_hash
    modulus: 1
    action: hashmod
  - source_labels:
    - __tmp_hash
    regex: $(SHARD)
    action: keep
  metric_relabel_configs: []
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-kubelet/0
  honor_labels: true
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - kube-system
  scheme: https
  tls_config:
    insecure_skip_verify: true
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  relabel_configs:
  - source_labels:
    - job
    target_label: __tmp_prometheus_job_name
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_app_kubernetes_io_name
    - __meta_kubernetes_service_labelpresent_app_kubernetes_io_name
    regex: (kubelet);true
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_k8s_app
    - __meta_kubernetes_service_labelpresent_k8s_app
    regex: (kubelet);true
  - action: keep
    source_labels:
    - __meta_kubernetes_endpoint_port_name
    regex: https-metrics
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Node;(.*)
    replacement: ${1}
    target_label: node
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Pod;(.*)
    replacement: ${1}
    target_label: pod
  - source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: service
  - source_labels:
    - __meta_kubernetes_pod_name
    target_label: pod
  - source_labels:
    - __meta_kubernetes_pod_container_name
    target_label: container
  - action: drop
    source_labels:
    - __meta_kubernetes_pod_phase
    regex: (Failed|Succeeded)
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: job
    replacement: ${1}
  - source_labels:
    - __meta_kubernetes_service_label_k8s_app
    target_label: job
    regex: (.+)
    replacement: ${1}
  - target_label: endpoint
    replacement: https-metrics
  - source_labels:
    - __metrics_path__
    target_label: metrics_path
    action: replace
  - source_labels:
    - __address__
    target_label: __tmp_hash
    modulus: 1
    action: hashmod
  - source_labels:
    - __tmp_hash
    regex: $(SHARD)
    action: keep
  metric_relabel_configs: []
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-kubelet/1
  honor_labels: true
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - kube-system
  metrics_path: /metrics/cadvisor
  scheme: https
  tls_config:
    insecure_skip_verify: true
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  relabel_configs:
  - source_labels:
    - job
    target_label: __tmp_prometheus_job_name
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_app_kubernetes_io_name
    - __meta_kubernetes_service_labelpresent_app_kubernetes_io_name
    regex: (kubelet);true
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_k8s_app
    - __meta_kubernetes_service_labelpresent_k8s_app
    regex: (kubelet);true
  - action: keep
    source_labels:
    - __meta_kubernetes_endpoint_port_name
    regex: https-metrics
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Node;(.*)
    replacement: ${1}
    target_label: node
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Pod;(.*)
    replacement: ${1}
    target_label: pod
  - source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: service
  - source_labels:
    - __meta_kubernetes_pod_name
    target_label: pod
  - source_labels:
    - __meta_kubernetes_pod_container_name
    target_label: container
  - action: drop
    source_labels:
    - __meta_kubernetes_pod_phase
    regex: (Failed|Succeeded)
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: job
    replacement: ${1}
  - source_labels:
    - __meta_kubernetes_service_label_k8s_app
    target_label: job
    regex: (.+)
    replacement: ${1}
  - target_label: endpoint
    replacement: https-metrics
  - source_labels:
    - __metrics_path__
    target_label: metrics_path
    action: replace
  - source_labels:
    - __address__
    target_label: __tmp_hash
    modulus: 1
    action: hashmod
  - source_labels:
    - __tmp_hash
    regex: $(SHARD)
    action: keep
  metric_relabel_configs:
  - source_labels:
    - __name__
    regex: container_cpu_(cfs_throttled_seconds_total|load_average_10s|system_seconds_total|user_seconds_total)
    action: drop
  - source_labels:
    - __name__
    regex: container_fs_(io_current|io_time_seconds_total|io_time_weighted_seconds_total|reads_merged_total|sector_reads_total|sector_writes_total|writes_merged_total)
    action: drop
  - source_labels:
    - __name__
    regex: container_memory_(mapped_file|swap)
    action: drop
  - source_labels:
    - __name__
    regex: container_(file_descriptors|tasks_state|threads_max)
    action: drop
  - source_labels:
    - __name__
    regex: container_spec.*
    action: drop
  - source_labels:
    - id
    - pod
    regex: .+;
    action: drop
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-kubelet/2
  honor_labels: true
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - kube-system
  metrics_path: /metrics/probes
  scheme: https
  tls_config:
    insecure_skip_verify: true
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  relabel_configs:
  - source_labels:
    - job
    target_label: __tmp_prometheus_job_name
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_app_kubernetes_io_name
    - __meta_kubernetes_service_labelpresent_app_kubernetes_io_name
    regex: (kubelet);true
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_k8s_app
    - __meta_kubernetes_service_labelpresent_k8s_app
    regex: (kubelet);true
  - action: keep
    source_labels:
    - __meta_kubernetes_endpoint_port_name
    regex: https-metrics
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Node;(.*)
    replacement: ${1}
    target_label: node
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Pod;(.*)
    replacement: ${1}
    target_label: pod
  - source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: service
  - source_labels:
    - __meta_kubernetes_pod_name
    target_label: pod
  - source_labels:
    - __meta_kubernetes_pod_container_name
    target_label: container
  - action: drop
    source_labels:
    - __meta_kubernetes_pod_phase
    regex: (Failed|Succeeded)
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: job
    replacement: ${1}
  - source_labels:
    - __meta_kubernetes_service_label_k8s_app
    target_label: job
    regex: (.+)
    replacement: ${1}
  - target_label: endpoint
    replacement: https-metrics
  - source_labels:
    - __metrics_path__
    target_label: metrics_path
    action: replace
  - source_labels:
    - __address__
    target_label: __tmp_hash
    modulus: 1
    action: hashmod
  - source_labels:
    - __tmp_hash
    regex: $(SHARD)
    action: keep
  metric_relabel_configs: []
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-operator/0
  honor_labels: true
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - aml-monitoring
  scheme: https
  tls_config:
    insecure_skip_verify: false
    ca_file: /etc/prometheus/certs/secret_aml-monitoring_aml-monitoring-kube-promet-admission_ca
    server_name: aml-monitoring-kube-promet-operator
  relabel_configs:
  - source_labels:
    - job
    target_label: __tmp_prometheus_job_name
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_app
    - __meta_kubernetes_service_labelpresent_app
    regex: (kube-prometheus-stack-operator);true
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_release
    - __meta_kubernetes_service_labelpresent_release
    regex: (aml-monitoring);true
  - action: keep
    source_labels:
    - __meta_kubernetes_endpoint_port_name
    regex: https
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Node;(.*)
    replacement: ${1}
    target_label: node
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Pod;(.*)
    replacement: ${1}
    target_label: pod
  - source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: service
  - source_labels:
    - __meta_kubernetes_pod_name
    target_label: pod
  - source_labels:
    - __meta_kubernetes_pod_container_name
    target_label: container
  - action: drop
    source_labels:
    - __meta_kubernetes_pod_phase
    regex: (Failed|Succeeded)
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: job
    replacement: ${1}
  - target_label: endpoint
    replacement: https
  - source_labels:
    - __address__
    target_label: __tmp_hash
    modulus: 1
    action: hashmod
  - source_labels:
    - __tmp_hash
    regex: $(SHARD)
    action: keep
  metric_relabel_configs: []
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-prometheus/0
  honor_labels: false
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - aml-monitoring
  metrics_path: /metrics
  relabel_configs:
  - source_labels:
    - job
    target_label: __tmp_prometheus_job_name
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_app
    - __meta_kubernetes_service_labelpresent_app
    regex: (kube-prometheus-stack-prometheus);true
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_release
    - __meta_kubernetes_service_labelpresent_release
    regex: (aml-monitoring);true
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_self_monitor
    - __meta_kubernetes_service_labelpresent_self_monitor
    regex: (true);true
  - action: keep
    source_labels:
    - __meta_kubernetes_endpoint_port_name
    regex: http-web
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Node;(.*)
    replacement: ${1}
    target_label: node
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Pod;(.*)
    replacement: ${1}
    target_label: pod
  - source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: service
  - source_labels:
    - __meta_kubernetes_pod_name
    target_label: pod
  - source_labels:
    - __meta_kubernetes_pod_container_name
    target_label: container
  - action: drop
    source_labels:
    - __meta_kubernetes_pod_phase
    regex: (Failed|Succeeded)
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: job
    replacement: ${1}
  - target_label: endpoint
    replacement: http-web
  - source_labels:
    - __address__
    target_label: __tmp_hash
    modulus: 1
    action: hashmod
  - source_labels:
    - __tmp_hash
    regex: $(SHARD)
    action: keep
  metric_relabel_configs: []
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-state-metrics/0
  honor_labels: true
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - aml-monitoring
  relabel_configs:
  - source_labels:
    - job
    target_label: __tmp_prometheus_job_name
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_app_kubernetes_io_instance
    - __meta_kubernetes_service_labelpresent_app_kubernetes_io_instance
    regex: (aml-monitoring);true
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_app_kubernetes_io_name
    - __meta_kubernetes_service_labelpresent_app_kubernetes_io_name
    regex: (kube-state-metrics);true
  - action: keep
    source_labels:
    - __meta_kubernetes_endpoint_port_name
    regex: http
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Node;(.*)
    replacement: ${1}
    target_label: node
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Pod;(.*)
    replacement: ${1}
    target_label: pod
  - source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: service
  - source_labels:
    - __meta_kubernetes_pod_name
    target_label: pod
  - source_labels:
    - __meta_kubernetes_pod_container_name
    target_label: container
  - action: drop
    source_labels:
    - __meta_kubernetes_pod_phase
    regex: (Failed|Succeeded)
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: job
    replacement: ${1}
  - source_labels:
    - __meta_kubernetes_service_label_app_kubernetes_io_name
    target_label: job
    regex: (.+)
    replacement: ${1}
  - target_label: endpoint
    replacement: http
  - source_labels:
    - __address__
    target_label: __tmp_hash
    modulus: 1
    action: hashmod
  - source_labels:
    - __tmp_hash
    regex: $(SHARD)
    action: keep
  metric_relabel_configs: []
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-loki/0
  honor_labels: false
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - aml-monitoring
  scrape_interval: 15s
  metrics_path: /metrics
  scheme: http
  relabel_configs:
  - source_labels:
    - job
    target_label: __tmp_prometheus_job_name
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_app_kubernetes_io_instance
    - __meta_kubernetes_service_labelpresent_app_kubernetes_io_instance
    regex: (aml-monitoring);true
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_app_kubernetes_io_name
    - __meta_kubernetes_service_labelpresent_app_kubernetes_io_name
    regex: (loki);true
  - action: drop
    source_labels:
    - __meta_kubernetes_service_label_prometheus_io_service_monitor
    - __meta_kubernetes_service_labelpresent_prometheus_io_service_monitor
    regex: (false);true
  - action: keep
    source_labels:
    - __meta_kubernetes_endpoint_port_name
    regex: http-metrics
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Node;(.*)
    replacement: ${1}
    target_label: node
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Pod;(.*)
    replacement: ${1}
    target_label: pod
  - source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: service
  - source_labels:
    - __meta_kubernetes_pod_name
    target_label: pod
  - source_labels:
    - __meta_kubernetes_pod_container_name
    target_label: container
  - action: drop
    source_labels:
    - __meta_kubernetes_pod_phase
    regex: (Failed|Succeeded)
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: job
    replacement: ${1}
  - target_label: endpoint
    replacement: http-metrics
  - source_labels:
    - job
    target_label: job
    replacement: aml-monitoring/$1
    action: replace
  - target_label: cluster
    replacement: aml-monitoring-loki
    action: replace
  - source_labels:
    - __address__
    target_label: __tmp_hash
    modulus: 1
    action: hashmod
  - source_labels:
    - __tmp_hash
    regex: $(SHARD)
    action: keep
  metric_relabel_configs: []
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-prometheus-node-exporter/0
  honor_labels: false
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - aml-monitoring
    attach_metadata:
      node: false
  scheme: http
  relabel_configs:
  - source_labels:
    - job
    target_label: __tmp_prometheus_job_name
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_app_kubernetes_io_instance
    - __meta_kubernetes_service_labelpresent_app_kubernetes_io_instance
    regex: (aml-monitoring);true
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_app_kubernetes_io_name
    - __meta_kubernetes_service_labelpresent_app_kubernetes_io_name
    regex: (prometheus-node-exporter);true
  - action: keep
    source_labels:
    - __meta_kubernetes_endpoint_port_name
    regex: http-metrics
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Node;(.*)
    replacement: ${1}
    target_label: node
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Pod;(.*)
    replacement: ${1}
    target_label: pod
  - source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: service
  - source_labels:
    - __meta_kubernetes_pod_name
    target_label: pod
  - source_labels:
    - __meta_kubernetes_pod_container_name
    target_label: container
  - action: drop
    source_labels:
    - __meta_kubernetes_pod_phase
    regex: (Failed|Succeeded)
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: job
    replacement: ${1}
  - source_labels:
    - __meta_kubernetes_service_label_jobLabel
    target_label: job
    regex: (.+)
    replacement: ${1}
  - target_label: endpoint
    replacement: http-metrics
  - source_labels:
    - __address__
    target_label: __tmp_hash
    modulus: 1
    action: hashmod
  - source_labels:
    - __tmp_hash
    regex: $(SHARD)
    action: keep
  metric_relabel_configs: []
storage:
  tsdb:
    out_of_order_time_window: 0s
alerting:
  alert_relabel_configs:
  - action: labeldrop
    regex: prometheus_replica
  alertmanagers:
  - path_prefix: /
    scheme: http
    kubernetes_sd_configs:
    - role: endpoints
      namespaces:
        names:
        - aml-monitoring
    api_version: v2
    relabel_configs:
    - action: keep
      source_labels:
      - __meta_kubernetes_service_name
      regex: aml-monitoring-kube-promet-alertmanager
    - action: keep
      source_labels:
      - __meta_kubernetes_endpoint_port_name
      regex: http-web
johnswarbrick-napier commented 1 year ago

I think the problem is that namespace filtering is broken.

kube-prometheus-stack is installed into namespace aml-monitoring.

The following config options are set, where I'm explicitly forcing scraping of the aml and tailscale namespaces:

  prometheusOperator:
    namespaces:
      releaseNamespace: true
      additional:
        - aml
        - tailscale
  prometheus:
    prometheusSpec:
      podMonitorSelectorNilUsesHelmValues: false
      serviceMonitorSelectorNilUsesHelmValues: false

The Prometheus Operator logs indicate that it sees all namespaces during filtering, including the target aml and tailscale:

level=debug ts=2023-07-04T22:14:54.695699749Z caller=resource_selector.go:93 component=prometheusoperator msg="filtering namespaces to select ServiceMonitors from" namespaces=kube-public,kube-system,default,ingress-nginx,test-aml,aml,aml-monitoring,kube-node-lease,tailscale namespace=aml-monitoring prometheus=aml-monitoring-kube-promet

level=debug ts=2023-07-04T22:14:54.702177763Z caller=resource_selector.go:334 component=prometheusoperator msg="filtering namespaces to select PodMonitors from" namespaces=tailscale,aml,aml-monitoring,kube-node-lease,test-aml,kube-public,kube-system,default,ingress-nginx namespace=aml-monitoring prometheus=aml-monitoring-kube-promet

...but it then only ever selects from the installation namespace, aml-monitoring, and completely ignores the target namespaces:

level=debug ts=2023-07-04T22:14:54.702152062Z caller=resource_selector.go:191 component=prometheusoperator msg="selected ServiceMonitors" servicemonitors=aml-monitoring/aml-monitoring-kube-promet-prometheus,aml-monitoring/aml-monitoring-kube-state-metrics,aml-monitoring/aml-monitoring-loki,aml-monitoring/aml-monitoring-kube-promet-operator,aml-monitoring/aml-monitoring-kube-promet-coredns,aml-monitoring/aml-monitoring-kube-promet-kubelet,aml-monitoring/aml-monitoring-kube-promet-kube-etcd,aml-monitoring/aml-monitoring-kube-promet-apiserver,aml-monitoring/aml-monitoring-prometheus-node-exporter,aml-monitoring/aml-monitoring-kube-promet-alertmanager namespace=aml-monitoring prometheus=aml-monitoring-kube-promet

level=debug ts=2023-07-04T22:14:54.702200793Z caller=resource_selector.go:424 component=prometheusoperator msg="selected PodMonitors" podmonitors= namespace=aml-monitoring prometheus=aml-monitoring-kube-promet

As a result, the Prometheus Operator never adds PodMonitors and ServiceMonitors from those additional namespaces to the prometheus.yaml.gz in the Prometheus secret, so they are never scraped by Prometheus.

The same config appears to work fine on Azure. Maybe this is an EKS permissions issue, with a Kubernetes API query silently failing and not being logged? I cannot find any access-denied or other errors anywhere in the cluster, though.

@cccsss01 - I think you experienced the same in https://github.com/prometheus-community/helm-charts/issues/3410

@sebastianlutter - https://github.com/prometheus-community/helm-charts/issues/3487 seems exactly the same - how did you fix this?

johnswarbrick-napier commented 1 year ago

Looks like https://github.com/prometheus-community/helm-charts/issues/2323 is similar.

@arpitjindal97 - what am I missing here?

I modified my configuration to:

kube-prometheus-stack:
  prometheusOperator:
    namespaces:
      releaseNamespace: true
      additional:
        - ingress-nginx
        - aml
        - kube-system
  prometheus:
    prometheusSpec:
      podMonitorSelectorNilUsesHelmValues: false
      podMonitorNamespaceSelector:
        matchExpressions:
          - key: kubernetes.io/metadata.name
            operator: Exists
      serviceMonitorSelectorNilUsesHelmValues: false
      serviceMonitorSelector: {}
      serviceMonitorNamespaceSelector:
        matchExpressions:
          - key: kubernetes.io/metadata.name
            operator: Exists

...to match the output of kubectl get ns/aml -o json:

{
    "apiVersion": "v1",
    "kind": "Namespace",
    "metadata": {
        "creationTimestamp": "2023-02-01T11:48:14Z",
        "labels": {
            "kubernetes.io/metadata.name": "aml"
        },
        "name": "aml",
        "resourceVersion": "16264136",
        "uid": "6173158f-3202-4fc4-bebc-d7b2ea4ba36a"
    },
    "spec": {
        "finalizers": [
            "kubernetes"
        ]
    },
    "status": {
        "phase": "Active"
    }
}
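As a sanity check that the selectors tried in this issue should match the namespace shown above, here is a minimal sketch of Kubernetes label-selector semantics. It is a simplified model written for illustration, not the operator's code, and the function names are made up. An empty selector ({}) matches every namespace; as far as I understand the operator's API docs, a null (unset) namespace selector restricts discovery to the Prometheus object's own namespace.

```python
def matches_expression(labels: dict, expr: dict) -> bool:
    """Evaluate one matchExpressions entry against a set of labels."""
    key = expr["key"]
    operator = expr["operator"]
    values = expr.get("values", [])
    if operator == "Exists":
        return key in labels
    if operator == "DoesNotExist":
        return key not in labels
    if operator == "In":
        return key in labels and labels[key] in values
    if operator == "NotIn":
        # NotIn also matches objects that do not carry the label at all.
        return key not in labels or labels[key] not in values
    raise ValueError(f"unsupported operator: {operator}")


def selector_matches(labels: dict, selector: dict) -> bool:
    """Return True if the labels satisfy every requirement of the selector."""
    for key, value in selector.get("matchLabels", {}).items():
        if labels.get(key) != value:
            return False
    return all(
        matches_expression(labels, expr)
        for expr in selector.get("matchExpressions", [])
    )
```

With the labels from the aml namespace, both the Exists selector on kubernetes.io/metadata.name and the DoesNotExist selector on non-existent-label evaluate to a match, so the selector values themselves do not explain why the namespace is skipped.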

But it's still not picking up any podmonitors in namespaces outside the installation namespace:

level=debug ts=2023-07-05T03:22:22.492509284Z caller=resource_selector.go:93 component=prometheusoperator msg="filtering namespaces to select ServiceMonitors from" namespaces=aml,kube-system,aml-monitoring,ingress-nginx namespace=aml-monitoring prometheus=aml-monitoring-kube-promet-prometheus

level=debug ts=2023-07-05T03:22:22.496242835Z caller=klog.go:84 component=k8s_client_runtime func=Infof msg="GET https://10.101.0.1:443/api/v1/namespaces/aml-monitoring/secrets/aml-monitoring-kube-promet-admission 200 OK in 3 milliseconds"

level=debug ts=2023-07-05T03:22:22.497143299Z caller=resource_selector.go:191 component=prometheusoperator msg="selected ServiceMonitors" servicemonitors=aml-monitoring/aml-monitoring-kube-promet-kube-etcd,aml-monitoring/aml-monitoring-kube-promet-coredns,aml-monitoring/aml-monitoring-kube-promet-operator,aml-monitoring/aml-monitoring-kube-state-metrics,aml-monitoring/aml-monitoring-kube-promet-kubelet,aml-monitoring/aml-monitoring-loki,aml-monitoring/aml-monitoring-kube-promet-alertmanager,aml-monitoring/aml-monitoring-kube-promet-prometheus,aml-monitoring/aml-monitoring-prometheus-node-exporter,aml-monitoring/aml-monitoring-kube-promet-apiserver namespace=aml-monitoring prometheus=aml-monitoring-kube-promet-prometheus

level=debug ts=2023-07-05T03:22:22.497199411Z caller=resource_selector.go:334 component=prometheusoperator msg="filtering namespaces to select PodMonitors from" namespaces=kube-system,aml-monitoring,ingress-nginx,aml namespace=aml-monitoring prometheus=aml-monitoring-kube-promet-prometheus
level=debug ts=2023-07-05T03:22:22.497229962Z caller=resource_selector.go:424 component=prometheusoperator msg="selected PodMonitors" podmonitors= namespace=aml-monitoring prometheus=aml-monitoring-kube-promet-prometheus
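The gap between the namespaces the operator filters and the namespaces its selected monitors come from can be pulled out of these debug lines mechanically. This is a rough sketch that assumes the key=value layout shown in the logs above and that quoted values contain no stray quote characters; parse_logfmt and unselected_namespaces are illustrative names, not operator tooling.

```python
import shlex


def parse_logfmt(line: str) -> dict:
    """Split a key=value log line into a dict, honouring double-quoted values."""
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


def unselected_namespaces(filter_line: str, selected_line: str,
                          kind: str = "servicemonitors") -> list:
    """List candidate namespaces from which no monitor of the given kind was selected.

    filter_line is a "filtering namespaces to select ..." log line and
    selected_line is the matching "selected ..." log line.
    """
    candidates = set(parse_logfmt(filter_line)["namespaces"].split(","))
    selected = parse_logfmt(selected_line).get(kind, "")
    # Selected monitors are logged as namespace/name pairs.
    used = {item.split("/", 1)[0] for item in selected.split(",") if item}
    return sorted(candidates - used)
```

A namespace that simply contains no ServiceMonitor or PodMonitor objects will also show up as unselected, so the result is only meaningful for namespaces where such objects are known to exist.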

The generated Prometheus config is unchanged and only references the installation namespace, plus default and kube-system for the chart's built-in ServiceMonitors:

global:
  evaluation_interval: 30s
  scrape_interval: 30s
  external_labels:
    prometheus: aml-monitoring/aml-monitoring-kube-promet-prometheus
    prometheus_replica: $(POD_NAME)
rule_files:
- /etc/prometheus/rules/prometheus-aml-monitoring-kube-promet-prometheus-rulefiles-0/*.yaml
scrape_configs:
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-alertmanager/0
  honor_labels: false
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - aml-monitoring
  metrics_path: /metrics
  enable_http2: true
  relabel_configs:
  - source_labels:
    - job
    target_label: __tmp_prometheus_job_name
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_app
    - __meta_kubernetes_service_labelpresent_app
    regex: (kube-prometheus-stack-alertmanager);true
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_release
    - __meta_kubernetes_service_labelpresent_release
    regex: (aml-monitoring);true
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_self_monitor
    - __meta_kubernetes_service_labelpresent_self_monitor
    regex: (true);true
  - action: keep
    source_labels:
    - __meta_kubernetes_endpoint_port_name
    regex: http-web
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Node;(.*)
    replacement: ${1}
    target_label: node
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Pod;(.*)
    replacement: ${1}
    target_label: pod
  - source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: service
  - source_labels:
    - __meta_kubernetes_pod_name
    target_label: pod
  - source_labels:
    - __meta_kubernetes_pod_container_name
    target_label: container
  - action: drop
    source_labels:
    - __meta_kubernetes_pod_phase
    regex: (Failed|Succeeded)
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: job
    replacement: ${1}
  - target_label: endpoint
    replacement: http-web
  - source_labels:
    - __address__
    target_label: __tmp_hash
    modulus: 1
    action: hashmod
  - source_labels:
    - __tmp_hash
    regex: $(SHARD)
    action: keep
  metric_relabel_configs: []
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-alertmanager/1
  honor_labels: false
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - aml-monitoring
  metrics_path: /metrics
  relabel_configs:
  - source_labels:
    - job
    target_label: __tmp_prometheus_job_name
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_app
    - __meta_kubernetes_service_labelpresent_app
    regex: (kube-prometheus-stack-alertmanager);true
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_release
    - __meta_kubernetes_service_labelpresent_release
    regex: (aml-monitoring);true
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_self_monitor
    - __meta_kubernetes_service_labelpresent_self_monitor
    regex: (true);true
  - action: keep
    source_labels:
    - __meta_kubernetes_endpoint_port_name
    regex: reloader-web
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Node;(.*)
    replacement: ${1}
    target_label: node
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Pod;(.*)
    replacement: ${1}
    target_label: pod
  - source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: service
  - source_labels:
    - __meta_kubernetes_pod_name
    target_label: pod
  - source_labels:
    - __meta_kubernetes_pod_container_name
    target_label: container
  - action: drop
    source_labels:
    - __meta_kubernetes_pod_phase
    regex: (Failed|Succeeded)
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: job
    replacement: ${1}
  - target_label: endpoint
    replacement: reloader-web
  - source_labels:
    - __address__
    target_label: __tmp_hash
    modulus: 1
    action: hashmod
  - source_labels:
    - __tmp_hash
    regex: $(SHARD)
    action: keep
  metric_relabel_configs: []
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-apiserver/0
  honor_labels: false
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - default
  scheme: https
  tls_config:
    insecure_skip_verify: false
    server_name: kubernetes
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  relabel_configs:
  - source_labels:
    - job
    target_label: __tmp_prometheus_job_name
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_component
    - __meta_kubernetes_service_labelpresent_component
    regex: (apiserver);true
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_provider
    - __meta_kubernetes_service_labelpresent_provider
    regex: (kubernetes);true
  - action: keep
    source_labels:
    - __meta_kubernetes_endpoint_port_name
    regex: https
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Node;(.*)
    replacement: ${1}
    target_label: node
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Pod;(.*)
    replacement: ${1}
    target_label: pod
  - source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: service
  - source_labels:
    - __meta_kubernetes_pod_name
    target_label: pod
  - source_labels:
    - __meta_kubernetes_pod_container_name
    target_label: container
  - action: drop
    source_labels:
    - __meta_kubernetes_pod_phase
    regex: (Failed|Succeeded)
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: job
    replacement: ${1}
  - source_labels:
    - __meta_kubernetes_service_label_component
    target_label: job
    regex: (.+)
    replacement: ${1}
  - target_label: endpoint
    replacement: https
  - source_labels:
    - __address__
    target_label: __tmp_hash
    modulus: 1
    action: hashmod
  - source_labels:
    - __tmp_hash
    regex: $(SHARD)
    action: keep
  metric_relabel_configs:
  - source_labels:
    - __name__
    - le
    regex: apiserver_request_duration_seconds_bucket;(0.15|0.2|0.3|0.35|0.4|0.45|0.6|0.7|0.8|0.9|1.25|1.5|1.75|2|3|3.5|4|4.5|6|7|8|9|15|25|40|50)
    action: drop
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-coredns/0
  honor_labels: false
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - kube-system
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  relabel_configs:
  - source_labels:
    - job
    target_label: __tmp_prometheus_job_name
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_app
    - __meta_kubernetes_service_labelpresent_app
    regex: (kube-prometheus-stack-coredns);true
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_release
    - __meta_kubernetes_service_labelpresent_release
    regex: (aml-monitoring);true
  - action: keep
    source_labels:
    - __meta_kubernetes_endpoint_port_name
    regex: http-metrics
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Node;(.*)
    replacement: ${1}
    target_label: node
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Pod;(.*)
    replacement: ${1}
    target_label: pod
  - source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: service
  - source_labels:
    - __meta_kubernetes_pod_name
    target_label: pod
  - source_labels:
    - __meta_kubernetes_pod_container_name
    target_label: container
  - action: drop
    source_labels:
    - __meta_kubernetes_pod_phase
    regex: (Failed|Succeeded)
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: job
    replacement: ${1}
  - source_labels:
    - __meta_kubernetes_service_label_jobLabel
    target_label: job
    regex: (.+)
    replacement: ${1}
  - target_label: endpoint
    replacement: http-metrics
  - source_labels:
    - __address__
    target_label: __tmp_hash
    modulus: 1
    action: hashmod
  - source_labels:
    - __tmp_hash
    regex: $(SHARD)
    action: keep
  metric_relabel_configs: []
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-kube-etcd/0
  honor_labels: false
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - kube-system
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  relabel_configs:
  - source_labels:
    - job
    target_label: __tmp_prometheus_job_name
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_app
    - __meta_kubernetes_service_labelpresent_app
    regex: (kube-prometheus-stack-kube-etcd);true
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_release
    - __meta_kubernetes_service_labelpresent_release
    regex: (aml-monitoring);true
  - action: keep
    source_labels:
    - __meta_kubernetes_endpoint_port_name
    regex: http-metrics
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Node;(.*)
    replacement: ${1}
    target_label: node
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Pod;(.*)
    replacement: ${1}
    target_label: pod
  - source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: service
  - source_labels:
    - __meta_kubernetes_pod_name
    target_label: pod
  - source_labels:
    - __meta_kubernetes_pod_container_name
    target_label: container
  - action: drop
    source_labels:
    - __meta_kubernetes_pod_phase
    regex: (Failed|Succeeded)
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: job
    replacement: ${1}
  - source_labels:
    - __meta_kubernetes_service_label_jobLabel
    target_label: job
    regex: (.+)
    replacement: ${1}
  - target_label: endpoint
    replacement: http-metrics
  - source_labels:
    - __address__
    target_label: __tmp_hash
    modulus: 1
    action: hashmod
  - source_labels:
    - __tmp_hash
    regex: $(SHARD)
    action: keep
  metric_relabel_configs: []
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-kubelet/0
  honor_labels: true
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - kube-system
  scheme: https
  tls_config:
    insecure_skip_verify: true
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  relabel_configs:
  - source_labels:
    - job
    target_label: __tmp_prometheus_job_name
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_app_kubernetes_io_name
    - __meta_kubernetes_service_labelpresent_app_kubernetes_io_name
    regex: (kubelet);true
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_k8s_app
    - __meta_kubernetes_service_labelpresent_k8s_app
    regex: (kubelet);true
  - action: keep
    source_labels:
    - __meta_kubernetes_endpoint_port_name
    regex: https-metrics
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Node;(.*)
    replacement: ${1}
    target_label: node
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Pod;(.*)
    replacement: ${1}
    target_label: pod
  - source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: service
  - source_labels:
    - __meta_kubernetes_pod_name
    target_label: pod
  - source_labels:
    - __meta_kubernetes_pod_container_name
    target_label: container
  - action: drop
    source_labels:
    - __meta_kubernetes_pod_phase
    regex: (Failed|Succeeded)
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: job
    replacement: ${1}
  - source_labels:
    - __meta_kubernetes_service_label_k8s_app
    target_label: job
    regex: (.+)
    replacement: ${1}
  - target_label: endpoint
    replacement: https-metrics
  - source_labels:
    - __metrics_path__
    target_label: metrics_path
    action: replace
  - source_labels:
    - __address__
    target_label: __tmp_hash
    modulus: 1
    action: hashmod
  - source_labels:
    - __tmp_hash
    regex: $(SHARD)
    action: keep
  metric_relabel_configs: []
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-kubelet/1
  honor_labels: true
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - kube-system
  metrics_path: /metrics/cadvisor
  scheme: https
  tls_config:
    insecure_skip_verify: true
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  relabel_configs:
  - source_labels:
    - job
    target_label: __tmp_prometheus_job_name
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_app_kubernetes_io_name
    - __meta_kubernetes_service_labelpresent_app_kubernetes_io_name
    regex: (kubelet);true
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_k8s_app
    - __meta_kubernetes_service_labelpresent_k8s_app
    regex: (kubelet);true
  - action: keep
    source_labels:
    - __meta_kubernetes_endpoint_port_name
    regex: https-metrics
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Node;(.*)
    replacement: ${1}
    target_label: node
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Pod;(.*)
    replacement: ${1}
    target_label: pod
  - source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: service
  - source_labels:
    - __meta_kubernetes_pod_name
    target_label: pod
  - source_labels:
    - __meta_kubernetes_pod_container_name
    target_label: container
  - action: drop
    source_labels:
    - __meta_kubernetes_pod_phase
    regex: (Failed|Succeeded)
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: job
    replacement: ${1}
  - source_labels:
    - __meta_kubernetes_service_label_k8s_app
    target_label: job
    regex: (.+)
    replacement: ${1}
  - target_label: endpoint
    replacement: https-metrics
  - source_labels:
    - __metrics_path__
    target_label: metrics_path
    action: replace
  - source_labels:
    - __address__
    target_label: __tmp_hash
    modulus: 1
    action: hashmod
  - source_labels:
    - __tmp_hash
    regex: $(SHARD)
    action: keep
  metric_relabel_configs:
  - source_labels:
    - __name__
    regex: container_cpu_(cfs_throttled_seconds_total|load_average_10s|system_seconds_total|user_seconds_total)
    action: drop
  - source_labels:
    - __name__
    regex: container_fs_(io_current|io_time_seconds_total|io_time_weighted_seconds_total|reads_merged_total|sector_reads_total|sector_writes_total|writes_merged_total)
    action: drop
  - source_labels:
    - __name__
    regex: container_memory_(mapped_file|swap)
    action: drop
  - source_labels:
    - __name__
    regex: container_(file_descriptors|tasks_state|threads_max)
    action: drop
  - source_labels:
    - __name__
    regex: container_spec.*
    action: drop
  - source_labels:
    - id
    - pod
    regex: .+;
    action: drop
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-kubelet/2
  honor_labels: true
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - kube-system
  metrics_path: /metrics/probes
  scheme: https
  tls_config:
    insecure_skip_verify: true
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  relabel_configs:
  - source_labels:
    - job
    target_label: __tmp_prometheus_job_name
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_app_kubernetes_io_name
    - __meta_kubernetes_service_labelpresent_app_kubernetes_io_name
    regex: (kubelet);true
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_k8s_app
    - __meta_kubernetes_service_labelpresent_k8s_app
    regex: (kubelet);true
  - action: keep
    source_labels:
    - __meta_kubernetes_endpoint_port_name
    regex: https-metrics
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Node;(.*)
    replacement: ${1}
    target_label: node
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Pod;(.*)
    replacement: ${1}
    target_label: pod
  - source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: service
  - source_labels:
    - __meta_kubernetes_pod_name
    target_label: pod
  - source_labels:
    - __meta_kubernetes_pod_container_name
    target_label: container
  - action: drop
    source_labels:
    - __meta_kubernetes_pod_phase
    regex: (Failed|Succeeded)
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: job
    replacement: ${1}
  - source_labels:
    - __meta_kubernetes_service_label_k8s_app
    target_label: job
    regex: (.+)
    replacement: ${1}
  - target_label: endpoint
    replacement: https-metrics
  - source_labels:
    - __metrics_path__
    target_label: metrics_path
    action: replace
  - source_labels:
    - __address__
    target_label: __tmp_hash
    modulus: 1
    action: hashmod
  - source_labels:
    - __tmp_hash
    regex: $(SHARD)
    action: keep
  metric_relabel_configs: []
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-operator/0
  honor_labels: true
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - aml-monitoring
  scheme: https
  tls_config:
    insecure_skip_verify: false
    ca_file: /etc/prometheus/certs/secret_aml-monitoring_aml-monitoring-kube-promet-admission_ca
    server_name: aml-monitoring-kube-promet-operator
  relabel_configs:
  - source_labels:
    - job
    target_label: __tmp_prometheus_job_name
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_app
    - __meta_kubernetes_service_labelpresent_app
    regex: (kube-prometheus-stack-operator);true
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_release
    - __meta_kubernetes_service_labelpresent_release
    regex: (aml-monitoring);true
  - action: keep
    source_labels:
    - __meta_kubernetes_endpoint_port_name
    regex: https
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Node;(.*)
    replacement: ${1}
    target_label: node
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Pod;(.*)
    replacement: ${1}
    target_label: pod
  - source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: service
  - source_labels:
    - __meta_kubernetes_pod_name
    target_label: pod
  - source_labels:
    - __meta_kubernetes_pod_container_name
    target_label: container
  - action: drop
    source_labels:
    - __meta_kubernetes_pod_phase
    regex: (Failed|Succeeded)
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: job
    replacement: ${1}
  - target_label: endpoint
    replacement: https
  - source_labels:
    - __address__
    target_label: __tmp_hash
    modulus: 1
    action: hashmod
  - source_labels:
    - __tmp_hash
    regex: $(SHARD)
    action: keep
  metric_relabel_configs: []
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-prometheus/0
  honor_labels: false
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - aml-monitoring
  metrics_path: /metrics
  relabel_configs:
  - source_labels:
    - job
    target_label: __tmp_prometheus_job_name
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_app
    - __meta_kubernetes_service_labelpresent_app
    regex: (kube-prometheus-stack-prometheus);true
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_release
    - __meta_kubernetes_service_labelpresent_release
    regex: (aml-monitoring);true
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_self_monitor
    - __meta_kubernetes_service_labelpresent_self_monitor
    regex: (true);true
  - action: keep
    source_labels:
    - __meta_kubernetes_endpoint_port_name
    regex: http-web
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Node;(.*)
    replacement: ${1}
    target_label: node
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Pod;(.*)
    replacement: ${1}
    target_label: pod
  - source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: service
  - source_labels:
    - __meta_kubernetes_pod_name
    target_label: pod
  - source_labels:
    - __meta_kubernetes_pod_container_name
    target_label: container
  - action: drop
    source_labels:
    - __meta_kubernetes_pod_phase
    regex: (Failed|Succeeded)
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: job
    replacement: ${1}
  - target_label: endpoint
    replacement: http-web
  - source_labels:
    - __address__
    target_label: __tmp_hash
    modulus: 1
    action: hashmod
  - source_labels:
    - __tmp_hash
    regex: $(SHARD)
    action: keep
  metric_relabel_configs: []
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-promet-prometheus/1
  honor_labels: false
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - aml-monitoring
  metrics_path: /metrics
  relabel_configs:
  - source_labels:
    - job
    target_label: __tmp_prometheus_job_name
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_app
    - __meta_kubernetes_service_labelpresent_app
    regex: (kube-prometheus-stack-prometheus);true
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_release
    - __meta_kubernetes_service_labelpresent_release
    regex: (aml-monitoring);true
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_self_monitor
    - __meta_kubernetes_service_labelpresent_self_monitor
    regex: (true);true
  - action: keep
    source_labels:
    - __meta_kubernetes_endpoint_port_name
    regex: reloader-web
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Node;(.*)
    replacement: ${1}
    target_label: node
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Pod;(.*)
    replacement: ${1}
    target_label: pod
  - source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: service
  - source_labels:
    - __meta_kubernetes_pod_name
    target_label: pod
  - source_labels:
    - __meta_kubernetes_pod_container_name
    target_label: container
  - action: drop
    source_labels:
    - __meta_kubernetes_pod_phase
    regex: (Failed|Succeeded)
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: job
    replacement: ${1}
  - target_label: endpoint
    replacement: reloader-web
  - source_labels:
    - __address__
    target_label: __tmp_hash
    modulus: 1
    action: hashmod
  - source_labels:
    - __tmp_hash
    regex: $(SHARD)
    action: keep
  metric_relabel_configs: []
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-kube-state-metrics/0
  honor_labels: true
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - aml-monitoring
  relabel_configs:
  - source_labels:
    - job
    target_label: __tmp_prometheus_job_name
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_app_kubernetes_io_instance
    - __meta_kubernetes_service_labelpresent_app_kubernetes_io_instance
    regex: (aml-monitoring);true
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_app_kubernetes_io_name
    - __meta_kubernetes_service_labelpresent_app_kubernetes_io_name
    regex: (kube-state-metrics);true
  - action: keep
    source_labels:
    - __meta_kubernetes_endpoint_port_name
    regex: http
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Node;(.*)
    replacement: ${1}
    target_label: node
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Pod;(.*)
    replacement: ${1}
    target_label: pod
  - source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: service
  - source_labels:
    - __meta_kubernetes_pod_name
    target_label: pod
  - source_labels:
    - __meta_kubernetes_pod_container_name
    target_label: container
  - action: drop
    source_labels:
    - __meta_kubernetes_pod_phase
    regex: (Failed|Succeeded)
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: job
    replacement: ${1}
  - source_labels:
    - __meta_kubernetes_service_label_app_kubernetes_io_name
    target_label: job
    regex: (.+)
    replacement: ${1}
  - target_label: endpoint
    replacement: http
  - source_labels:
    - __address__
    target_label: __tmp_hash
    modulus: 1
    action: hashmod
  - source_labels:
    - __tmp_hash
    regex: $(SHARD)
    action: keep
  metric_relabel_configs: []
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-loki/0
  honor_labels: false
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - aml-monitoring
  scrape_interval: 15s
  metrics_path: /metrics
  scheme: http
  relabel_configs:
  - source_labels:
    - job
    target_label: __tmp_prometheus_job_name
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_app_kubernetes_io_instance
    - __meta_kubernetes_service_labelpresent_app_kubernetes_io_instance
    regex: (aml-monitoring);true
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_app_kubernetes_io_name
    - __meta_kubernetes_service_labelpresent_app_kubernetes_io_name
    regex: (loki);true
  - action: drop
    source_labels:
    - __meta_kubernetes_service_label_prometheus_io_service_monitor
    - __meta_kubernetes_service_labelpresent_prometheus_io_service_monitor
    regex: (false);true
  - action: keep
    source_labels:
    - __meta_kubernetes_endpoint_port_name
    regex: http-metrics
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Node;(.*)
    replacement: ${1}
    target_label: node
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Pod;(.*)
    replacement: ${1}
    target_label: pod
  - source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: service
  - source_labels:
    - __meta_kubernetes_pod_name
    target_label: pod
  - source_labels:
    - __meta_kubernetes_pod_container_name
    target_label: container
  - action: drop
    source_labels:
    - __meta_kubernetes_pod_phase
    regex: (Failed|Succeeded)
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: job
    replacement: ${1}
  - target_label: endpoint
    replacement: http-metrics
  - source_labels:
    - job
    target_label: job
    replacement: aml-monitoring/$1
    action: replace
  - target_label: cluster
    replacement: aml-monitoring-loki
    action: replace
  - source_labels:
    - __address__
    target_label: __tmp_hash
    modulus: 1
    action: hashmod
  - source_labels:
    - __tmp_hash
    regex: $(SHARD)
    action: keep
  metric_relabel_configs: []
- job_name: serviceMonitor/aml-monitoring/aml-monitoring-prometheus-node-exporter/0
  honor_labels: false
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - aml-monitoring
    attach_metadata:
      node: false
  scheme: http
  relabel_configs:
  - source_labels:
    - job
    target_label: __tmp_prometheus_job_name
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_app_kubernetes_io_instance
    - __meta_kubernetes_service_labelpresent_app_kubernetes_io_instance
    regex: (aml-monitoring);true
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_app_kubernetes_io_name
    - __meta_kubernetes_service_labelpresent_app_kubernetes_io_name
    regex: (prometheus-node-exporter);true
  - action: keep
    source_labels:
    - __meta_kubernetes_endpoint_port_name
    regex: http-metrics
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Node;(.*)
    replacement: ${1}
    target_label: node
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Pod;(.*)
    replacement: ${1}
    target_label: pod
  - source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: service
  - source_labels:
    - __meta_kubernetes_pod_name
    target_label: pod
  - source_labels:
    - __meta_kubernetes_pod_container_name
    target_label: container
  - action: drop
    source_labels:
    - __meta_kubernetes_pod_phase
    regex: (Failed|Succeeded)
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: job
    replacement: ${1}
  - source_labels:
    - __meta_kubernetes_service_label_jobLabel
    target_label: job
    regex: (.+)
    replacement: ${1}
  - target_label: endpoint
    replacement: http-metrics
  - source_labels:
    - __address__
    target_label: __tmp_hash
    modulus: 1
    action: hashmod
  - source_labels:
    - __tmp_hash
    regex: $(SHARD)
    action: keep
  metric_relabel_configs: []
storage:
  tsdb:
    out_of_order_time_window: 0s
alerting:
  alert_relabel_configs:
  - action: labeldrop
    regex: prometheus_replica
  alertmanagers:
  - path_prefix: /
    scheme: http
    kubernetes_sd_configs:
    - role: endpoints
      namespaces:
        names:
        - aml-monitoring
    api_version: v2
    relabel_configs:
    - action: keep
      source_labels:
      - __meta_kubernetes_service_name
      regex: aml-monitoring-kube-promet-alertmanager
    - action: keep
      source_labels:
      - __meta_kubernetes_endpoint_port_name
      regex: http-web
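
For readers wondering about the `hashmod` / `keep` pair that ends each scrape job above: it is the sharding step. The sketch below approximates it in Python, assuming (as I understand the Prometheus implementation) that `hashmod` takes the MD5 of the joined source label values and reduces the low 8 bytes modulo `modulus`. The target addresses are hypothetical. With `modulus: 1` every target hashes to 0, so sharding on its own should not drop anything here.

```python
import hashlib


def hashmod(source_values, modulus, separator=";"):
    """Approximate Prometheus' `hashmod` relabel action.

    Joins the source label values with the separator, takes the MD5 digest,
    reads its last 8 bytes as a big-endian unsigned integer, and reduces it
    modulo `modulus`.
    """
    joined = separator.join(source_values)
    digest = hashlib.md5(joined.encode("utf-8")).digest()
    return int.from_bytes(digest[8:], "big") % modulus


def kept_by_shard(address, shard, modulus):
    """Mirror the hashmod + keep pair: a target is kept only on its own shard."""
    return hashmod([address], modulus) == shard


# Hypothetical target addresses, not taken from the thread.
targets = ["10.0.1.12:9100", "10.0.2.7:9100", "10.0.3.44:9100"]

# With modulus 1 (a single shard) every hash is 0, so shard 0 keeps everything.
single_shard = [t for t in targets if kept_by_shard(t, shard=0, modulus=1)]
```

With more than one shard, each target is kept by exactly one of them, which is why the `keep` rule compares `__tmp_hash` against the shard number.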

johnswarbrick-napier commented 1 year ago

Well, so this is interesting.

This is what I'm getting on my Amazon EKS cluster:

kubectl get podMonitor -n aml -o yaml

apiVersion: v1
items: []
kind: List
metadata:
  resourceVersion: ""

kubectl get serviceMonitor -n aml -o yaml

apiVersion: v1
items: []
kind: List
metadata:
  resourceVersion: ""

By contrast, this is what I get on my Azure AKS cluster, which runs exactly the same applications in the aml namespace with the exact same Prometheus versions and configuration:

kubectl get podMonitor -n aml -o yaml

apiVersion: v1
items:
- apiVersion: monitoring.coreos.com/v1
  kind: PodMonitor
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"monitoring.coreos.com/v1","kind":"PodMonitor","metadata":{"annotations":{},"labels":{"argocd.argoproj.io/instance":"devx-prod-aml"},"name":"kafka-cluster-operator-metrics","namespace":"aml"},"spec":{"podMetricsEndpoints":[{"path":"/metrics","port":"http"}],"selector":{"matchLabels":{"strimzi.io/kind":"cluster-operator"}}}}
    creationTimestamp: "2023-07-04T16:23:23Z"
    generation: 1
    labels:
      argocd.argoproj.io/instance: devx-prod-aml
    name: kafka-cluster-operator-metrics
    namespace: aml
    resourceVersion: "34474"
    uid: 69e5adeb-d69d-4b58-b45a-06bf514ab7bf
  spec:
    podMetricsEndpoints:
    - path: /metrics
      port: http
    selector:
      matchLabels:
        strimzi.io/kind: cluster-operator
- apiVersion: monitoring.coreos.com/v1
  kind: PodMonitor
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"monitoring.coreos.com/v1","kind":"PodMonitor","metadata":{"annotations":{},"labels":{"argocd.argoproj.io/instance":"devx-prod-aml"},"name":"kafka-entity-operator-metrics","namespace":"aml"},"spec":{"podMetricsEndpoints":[{"path":"/metrics","port":"healthcheck"}],"selector":{"matchLabels":{"app.kubernetes.io/name":"entity-operator"}}}}
    creationTimestamp: "2023-07-04T16:23:23Z"
    generation: 1
    labels:
      argocd.argoproj.io/instance: devx-prod-aml
    name: kafka-entity-operator-metrics
    namespace: aml
    resourceVersion: "34473"
    uid: b74d8a03-22f3-47e0-94f3-fd50fb9bfe64
  spec:
    podMetricsEndpoints:
    - path: /metrics
      port: healthcheck
    selector:
      matchLabels:
        app.kubernetes.io/name: entity-operator

kubectl get serviceMonitor -n aml -o yaml

apiVersion: v1
items:
- apiVersion: monitoring.coreos.com/v1
  kind: ServiceMonitor
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"monitoring.coreos.com/v1","kind":"ServiceMonitor","metadata":{"annotations":{},"labels":{"app":"nifi","argocd.argoproj.io/instance":"devx-prod-aml","chart":"nifi-1.1.42","heritage":"Helm","release":"devx-prod-aml"},"name":"nifi","namespace":"aml"},"spec":{"endpoints":[{"honorLabels":true,"port":"metrics"}],"namespaceSelector":{"matchNames":["aml"]},"selector":{"matchLabels":{"app":"nifi","release":"devx-prod-aml"}}}}
    creationTimestamp: "2023-07-04T16:23:23Z"
    generation: 1
    labels:
      app: nifi
      argocd.argoproj.io/instance: devx-prod-aml
      chart: nifi-1.1.42
      heritage: Helm
      release: devx-prod-aml
    name: nifi
    namespace: aml
    resourceVersion: "34480"
    uid: 59432a37-5209-483f-8829-b7efa076f273
  spec:
    endpoints:
    - honorLabels: true
      port: metrics
    namespaceSelector:
      matchNames:
      - aml
    selector:
      matchLabels:
        app: nifi
        release: devx-prod-aml

Here are the APIs:

[image: screenshot of the APIs]

Why would this be different on EKS (not working) vs AKS (working)?

I cannot find any error messages or failures anywhere!

sebastianlutter commented 1 year ago

I think we have a similar problem (https://github.com/prometheus-community/helm-charts/issues/3487).

I tested it with a local kind cluster as well as with k3s on Hetzner Cloud. It can discover a ServiceMonitor when it is in the same namespace as kube-prometheus-stack, in default, or in kube-system. But when I use my own namespace, it is not found.

In my case, my application namespace is pixolution.

The relevant part in my values.yaml looks like this:

prometheus:
  enabled: true
  prometheusSpec:
    nodeSelector:
      node-type: app
    serviceMonitorSelectorNilUsesHelmValues: false
    serviceMonitorSelector:
      matchLabels:
        release: kube-prometheus-stack
    serviceMonitorNamespaceSelector:
      matchExpressions:
        - key: name
          operator: In
          values:
            - monitoring
            - pixolution
            - kube-system
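
As an aside, here is a minimal sketch of how a `matchExpressions` selector like the one above is evaluated against a namespace's labels. The namespace label sets are hypothetical. The point it illustrates: the selector uses the key `name`, so a namespace matches only if it actually carries a `name` label. As far as I know, Kubernetes sets `kubernetes.io/metadata.name` automatically, but a plain `name` label has to be added by hand, so that is worth checking.

```python
def matches(selector, labels):
    """Evaluate a Kubernetes-style label selector against a label dict."""
    for key, value in selector.get("matchLabels", {}).items():
        if labels.get(key) != value:
            return False
    for expr in selector.get("matchExpressions", []):
        key, op = expr["key"], expr["operator"]
        values = expr.get("values", [])
        if op == "In":
            ok = key in labels and labels[key] in values
        elif op == "NotIn":
            ok = key not in labels or labels[key] not in values
        elif op == "Exists":
            ok = key in labels
        elif op == "DoesNotExist":
            ok = key not in labels
        else:
            raise ValueError(f"unknown operator: {op}")
        if not ok:
            return False
    return True


# The serviceMonitorNamespaceSelector from the values.yaml above.
namespace_selector = {
    "matchExpressions": [
        {
            "key": "name",
            "operator": "In",
            "values": ["monitoring", "pixolution", "kube-system"],
        },
    ]
}

# Hypothetical label sets for the pixolution namespace.
with_name_label = {"kubernetes.io/metadata.name": "pixolution", "name": "pixolution"}
without_name_label = {"kubernetes.io/metadata.name": "pixolution"}

selected = matches(namespace_selector, with_name_label)
not_selected = matches(namespace_selector, without_name_label)
```

Here `selected` is true and `not_selected` is false. The labels a namespace really has can be listed with `kubectl get namespace --show-labels`.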

My ServiceMonitor has the proper label (node-type: app) and is deployed to the pixolution namespace, but it is not discovered because of permission issues. I tried to solve this by creating a Role and RoleBinding in the pixolution namespace for the existing ServiceAccount kube-prometheus-stack-operator, but that did not work. Then I gave up.

kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  labels:
    app: kube-prometheus-stack-prometheus
    app.kubernetes.io/instance: kube-prometheus-stack
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/part-of: kube-prometheus-stack
  name: prometheus-kafka-clusterrole
rules:
  - verbs:
      - get
      - list
      - watch
    apiGroups:
      - ''
    resources:
      - nodes
      - nodes/metrics
      - services
      - endpoints
      - pods
  - verbs:
      - get
      - list
      - watch
    apiGroups:
      - networking.k8s.io
    resources:
      - ingresses
  - verbs:
      - get
      - list
      - watch
    nonResourceURLs:
      - /metrics
      - /metrics/cadvisor

---

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    app: kube-prometheus-stack-prometheus
    app.kubernetes.io/instance: kube-prometheus-stack
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/part-of: kube-prometheus-stack
  name: prometheus-kafka
subjects:
  - kind: ServiceAccount
    name: kube-prometheus-stack-operator
    namespace: monitoring
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus-kafka-clusterrole

See https://github.com/pixolution/kube-prometheus-stack-plus-kafka for the full example code.

johnswarbrick-napier commented 1 year ago

Managed to find the root cause.

If the applications are installed before the PodMonitor/ServiceMonitor CRDs are installed, they do not create their PodMonitors/ServiceMonitors, so there is nothing for Prometheus to discover.

Simply deleting the pods doesn't fix it; you have to install Prometheus first and then re-deploy the applications with a helm upgrade or similar.

After this, the PodMonitors/ServiceMonitors are created and Prometheus can discover them.
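
To catch this before deploying applications, a small pre-flight check along these lines might help. It is only a sketch: `missing_crds` and `check_cluster` are names made up for illustration, and `check_cluster` assumes `kubectl` is on the PATH and pointed at the right cluster.

```python
import subprocess

# CRDs that must exist before charts can create PodMonitors/ServiceMonitors.
REQUIRED_CRDS = (
    "podmonitors.monitoring.coreos.com",
    "servicemonitors.monitoring.coreos.com",
)


def missing_crds(kubectl_output, required=REQUIRED_CRDS):
    """Return the required CRD names absent from `kubectl get crd -o name` output."""
    present = set()
    for line in kubectl_output.splitlines():
        line = line.strip()
        if line:
            # Lines look like "customresourcedefinition.apiextensions.k8s.io/<name>".
            present.add(line.split("/", 1)[-1])
    return [name for name in required if name not in present]


def check_cluster():
    """Query the current kubectl context and report which CRDs are missing."""
    result = subprocess.run(
        ["kubectl", "get", "crd", "-o", "name"],
        check=True,
        capture_output=True,
        text=True,
    )
    return missing_crds(result.stdout)


# Hypothetical output from a cluster where only one of the two CRDs exists.
sample_output = (
    "customresourcedefinition.apiextensions.k8s.io/"
    "servicemonitors.monitoring.coreos.com\n"
)
still_missing = missing_crds(sample_output)
```

If `check_cluster()` returns a non-empty list, install kube-prometheus-stack first, then deploy (or re-deploy) the applications.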