prometheus-community / helm-charts

Prometheus community Helm charts

[kube-prometheus-stack] prometheus-kube-prometheus-admission-create getting stuck #3584

Open · ShivangKumarSingh opened this issue 1 year ago

ShivangKumarSingh commented 1 year ago

Describe the bug (a clear and concise description of what the bug is)

We are trying to upgrade the kube-prometheus-stack chart from 39.13.1 to 43.0.0 and are getting errors while doing so. The upgrade got stuck in the uninstall state with the error "helm.go:84: [debug] failed to delete release: prometheus". If we remove the release manually and then try a fresh install of 43.0.0, we get this output:

client.go:735: [debug] Add/Modify event for prometheus-kube-prometheus-admission-create: MODIFIED
client.go:774: [debug] prometheus-kube-prometheus-admission-create: Jobs active: 1, jobs failed: 0, jobs succeeded: 0
[debug] Re-evaluate condition on job cancellation for step: 'Deploy Platform'.

After that, the pipeline step was cancelled.

AKS version: 1.25. Helm version: 3.11.3

What's your helm version?

3.11.3

What's your kubectl version?

1.25.0

Which chart?

kube-prometheus-stack

What's the chart version?

43.0.0

What happened?

Same as described above: the in-place upgrade gets stuck with "helm.go:84: [debug] failed to delete release: prometheus", and a fresh install of 43.0.0 hangs on the prometheus-kube-prometheus-admission-create job (Jobs active: 1, jobs failed: 0, jobs succeeded: 0) until the 'Deploy Platform' pipeline step is cancelled.

AKS version: 1.25. Helm version: 3.11.3

What you expected to happen?

That the old release would uninstall successfully and the new version would install, or, after removing the release manually, that a fresh install of the new version would succeed.

How to reproduce it?

For a fresh installation:

helm upgrade prometheus prometheus-community/kube-prometheus-stack --install --version 43.0.0 --debug --wait --reset-values --timeout 24000s --create-namespace --namespace namespace_name -f <(envsubst < $(dirname $BASH_SOURCE)/thirdParty/prometheusConfig.yaml)

For uninstalling first and then upgrading (see the CRD update sketch below):

1. Delete the CRDs
2. helm uninstall --debug prometheus -n namespace_name
3. Delete all cluster roles
4. helm upgrade prometheus prometheus-community/kube-prometheus-stack --version 43.0.0 --debug --wait --reset-values --timeout 24000s --create-namespace --namespace namespace_name -f <(envsubst < $(dirname $BASH_SOURCE)/thirdParty/prometheusConfig.yaml)
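Note: when jumping several chart versions like this (39.x to 43.x), the chart's upgrade notes also expect the Prometheus Operator CRDs to be updated manually before running helm upgrade, because Helm does not upgrade CRDs on its own. A minimal sketch, assuming the operator bundled with chart 43.0.0 is v0.61.1 (the image tag used later in this values file); the exact CRD list and tag should be taken from the chart's README for the target version:

```sh
# Update the Prometheus Operator CRDs before upgrading the chart (Helm will not do this itself).
# The v0.61.1 tag and CRD file names are assumptions; check the kube-prometheus-stack upgrade notes.
for crd in alertmanagerconfigs alertmanagers podmonitors probes prometheuses prometheusrules servicemonitors thanosrulers; do
  kubectl apply --server-side --force-conflicts -f \
    "https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/v0.61.1/example/prometheus-operator-crd/monitoring.coreos.com_${crd}.yaml"
done
```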

Enter the changed values of values.yaml?

No response

Enter the command that you execute that is failing/misfunctioning.

helm upgrade prometheus prometheus-community/kube-prometheus-stack --install --version 43.0.0 --debug --wait --reset-values --timeout 24000s --create-namespace --namespace namespace_name -f <(envsubst < $(dirname $BASH_SOURCE)/thirdParty/prometheusConfig.yaml)

helm uninstall --debug prometheus -n namespace_name
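When the install hangs like this, it can help to look at what the admission-create hook job is actually doing before the pipeline timeout, and to clear out leftovers from the failed release before retrying. A minimal sketch, assuming the resource names match the ones in the error output above (confirm them with kubectl get first):

```sh
# Inspect the stuck hook job and its pod
kubectl -n namespace_name get jobs,pods | grep admission
kubectl -n namespace_name describe job prometheus-kube-prometheus-admission-create
kubectl -n namespace_name logs job/prometheus-kube-prometheus-admission-create

# If the release itself is stuck, skip hooks on uninstall and remove leftover
# admission resources before trying the install again
helm uninstall prometheus -n namespace_name --no-hooks --debug
kubectl -n namespace_name delete job prometheus-kube-prometheus-admission-create \
  prometheus-kube-prometheus-admission-patch --ignore-not-found
kubectl delete validatingwebhookconfiguration prometheus-kube-prometheus-admission --ignore-not-found
kubectl delete mutatingwebhookconfiguration prometheus-kube-prometheus-admission --ignore-not-found
```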

Anything else we need to know?

prometheusConfig.yaml

nameOverride: ""
namespaceOverride: ""
kubeTargetVersionOverride: ""
fullnameOverride: ""
commonLabels: {}

defaultRules:
  create: true
  rules:
    alertmanager: false
    etcd: false
    general: true
    k8s: false
    kubeApiserver: false
    kubeApiserverAvailability: false
    kubeApiserverError: false
    kubeApiserverSlos: false
    kubelet: false
    kubePrometheusGeneral: false
    kubePrometheusNodeAlerting: false
    kubePrometheusNodeRecording: false
    kubernetesAbsent: false
    kubernetesApps: false
    kubernetesResources: false
    kubernetesStorage: false
    kubernetesSystem: false
    kubeScheduler: false
    kubeStateMetrics: false
    network: false
    node: false
    prometheus: false
    prometheusOperator: false
    time: true

  ## Runbook url prefix for default rules
  runbookUrl: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#
  ## Reduce app namespace alert scope
  appNamespacesTarget: ".*"
  ## Labels for default rules
  labels: {}
  ## Annotations for default rules
  annotations: {}
  ## Additional labels for PrometheusRule alerts
  additionalRuleLabels: {}

additionalPrometheusRulesMap: {}

global:
  rbac:
    create: true
    pspEnabled: false
    pspAnnotations: {}
  imagePullSecrets: []

alertmanager:
  enabled: false
  apiVersion: v2
  serviceAccount:
    create: true
    name: ""
    annotations: {}
  podDisruptionBudget:
    enabled: false
    minAvailable: 1
    maxUnavailable: ""
  config:
    global:
      resolve_timeout: 5m
    route:
      group_by: ["job"]
      group_wait: 30s
      group_interval: 5m
      repeat_interval: 12h
      receiver: "null"
      routes:

## Using default values from https://github.com/grafana/helm-charts/blob/main/charts/grafana/values.yaml
grafana:
  enabled: true
  priorityClassName: ""
  podLabels:
    app: prometheus-grafana

  grafana.ini:
    users:
      viewers_can_edit: false
      auto_assign_org_role: Editor
      auto_assign_org: true
    auth:
      disable_login_form: true
      disable_signout_menu: true
    auth.anonymous:
      enabled: true
      org_role: Viewer
    auth.basic:
      enabled: false
    auth.proxy:
      enabled: true
      header_name: X-GRAFANA-USER
      header_property: username
      auto_sign_up: true
      sync_ttl: 60
    server:
      domain: "${}"
      root_url: "https://"
      serve_from_sub_path: true
    security:
      allow_embedding: true
    live:
      max_connections: 0

  namespaceOverride: ""

  ## Deploy default dashboards.
  defaultDashboardsEnabled: false

  adminPassword: prom-operator

  persistence:
    type: pvc
    enabled: true
    accessModes:

kubeApiServer:
  enabled: false
  tlsConfig:
    serverName: kubernetes
    insecureSkipVerify: false

  ## If your API endpoint address is not reachable (as in AKS) you can replace it with the kubernetes service
  relabelings: []

serviceMonitor:

Scrape interval. If not set, the Prometheus default scrape interval is used.

##
interval: ""
jobLabel: component
selector:
  matchLabels:
    component: apiserver
    provider: kubernetes

##  metric relabel configs to apply to samples before ingestion.
##
metricRelabelings: []

kubelet:
  enabled: false
  namespace: kube-system

serviceMonitor:

Scrape interval. If not set, the Prometheus default scrape interval is used.

##
interval: ""

## Enable scraping the kubelet over https. For requirements to enable this see
## https://github.com/prometheus-operator/prometheus-operator/issues/926
##
https: true

## Enable scraping /metrics/cadvisor from kubelet's service
##
cAdvisor: true

## Enable scraping /metrics/probes from kubelet's service
##
probes: true

## Enable scraping /metrics/resource from kubelet's service
## This is disabled by default because container metrics are already exposed by cAdvisor
##
resource: false
# From kubernetes 1.18, /metrics/resource/v1alpha1 renamed to /metrics/resource
resourcePath: "/metrics/resource"
## Metric relabellings to apply to samples before ingestion
##
cAdvisorMetricRelabelings: []

probesMetricRelabelings: []

cAdvisorRelabelings:
  - sourceLabels: [__metrics_path__]
    targetLabel: metrics_path

probesRelabelings:
  - sourceLabels: [__metrics_path__]
    targetLabel: metrics_path

resourceRelabelings:
  - sourceLabels: [__metrics_path__]
    targetLabel: metrics_path

metricRelabelings: []

relabelings:
  - sourceLabels: [__metrics_path__]
    targetLabel: metrics_path

kubeControllerManager:
  enabled: false

  ## If your kube controller manager is not deployed as a pod, specify IPs it can be found on
  endpoints: []

  service:
    port: Provided Values
    targetPort: Provided Values
    # selector:
    #   component: kube-controller-manager

serviceMonitor:

Scrape interval. If not set, the Prometheus default scrape interval is used.

##
interval: ""

## Enable scraping kube-controller-manager over https.
## Requires proper certs (not self-signed) and delegated authentication/authorization checks
##
https: false

# Skip TLS certificate validation when scraping
insecureSkipVerify: null

# Name of the server to use when validating TLS certificate
serverName: null

##  metric relabel configs to apply to samples before ingestion.
##
metricRelabelings: []

relabelings: []

coreDns:
  enabled: false
  service:
    port: Provided Values
    targetPort: Provided Values
    # selector:
    #   k8s-app: kube-dns

serviceMonitor:

Scrape interval. If not set, the Prometheus default scrape interval is used.

##
interval: ""

##  metric relabel configs to apply to samples before ingestion.
##
metricRelabelings: []

relabelings: []

kubeDns:
  enabled: false
  service:
    dnsmasq:
      port: Provided Values
      targetPort: Provided Values
    skydns:
      port: Provided Values
      targetPort: Provided Values
    # selector:
    #   k8s-app: kube-dns

serviceMonitor:

Scrape interval. If not set, the Prometheus default scrape interval is used.

##
interval: ""

##  metric relabel configs to apply to samples before ingestion.
##
metricRelabelings: []

relabelings: []

dnsmasqMetricRelabelings: []

dnsmasqRelabelings: []

kubeEtcd:
  enabled: false

  ## If your etcd is not deployed as a pod, specify IPs it can be found on
  endpoints: []
  # - 10.141.4.22
  # - 10.141.4.23
  # - 10.141.4.24

  ## Etcd service. If using kubeEtcd.endpoints only the port and targetPort are used
  service:
    port: Provided Values
    targetPort: Provided Values

serviceMonitor:

Scrape interval. If not set, the Prometheus default scrape interval is used.

##
interval: ""
scheme: http
insecureSkipVerify: false
serverName: ""
caFile: ""
certFile: ""
keyFile: ""

##  metric relabel configs to apply to samples before ingestion.
##
metricRelabelings: []

relabelings: []

kubeScheduler:
  enabled: false

  endpoints: []

  service:
    port: Provided Values
    targetPort: Provided Values
    # selector:
    #   component: kube-scheduler

serviceMonitor:

Scrape interval. If not set, the Prometheus default scrape interval is used.

##
interval: ""
## Enable scraping kube-scheduler over https.
## Requires proper certs (not self-signed) and delegated authentication/authorization checks
##
https: false

## Skip TLS certificate validation when scraping
insecureSkipVerify: null

## Name of the server to use when validating TLS certificate
serverName: null

##  metric relabel configs to apply to samples before ingestion.
##
metricRelabelings: []

relabelings: []

kubeProxy:
  enabled: false

  ## If your kube proxy is not deployed as a pod, specify IPs it can be found on
  endpoints: []

  service:
    port: 10249
    targetPort: 10249
    # selector:
    #   k8s-app: kube-proxy

serviceMonitor:

Scrape interval. If not set, the Prometheus default scrape interval is used.

##
interval: ""

## Enable scraping kube-proxy over https.
## Requires proper certs (not self-signed) and delegated authentication/authorization checks
##
https: false

##  metric relabel configs to apply to samples before ingestion.
##
metricRelabelings: []
relabelings: []

kubeStateMetrics:
  enabled: true
  serviceMonitor:

Scrape interval. If not set, the Prometheus default scrape interval is used.

##
interval: ""

##  metric relabel configs to apply to samples before ingestion.
##
metricRelabelings: []

relabelings: []

kube-state-metrics:
  priorityClassName: ""
  resources:
    limits:
      cpu: 100m
      memory: 512Mi
    requests:
      cpu: 10m
      memory: 128Mi
  namespaceOverride: ""
  rbac:
    create: true
  podSecurityPolicy:
    enabled: false

## Deploy node exporter as a daemonset to all nodes
nodeExporter:
  enabled: true

  ## Use the value configured in prometheus-node-exporter.podLabels
  jobLabel: jobLabel

serviceMonitor:

Scrape interval. If not set, the Prometheus default scrape interval is used.

##
interval: ""

## How long until a scrape request times out. If not set, the Prometheus default scape timeout is used.
##
scrapeTimeout: ""

##  metric relabel configs to apply to samples before ingestion.
##
metricRelabelings: []

relabelings: []

prometheus-node-exporter:
  resources:
    limits:
      cpu: 100m
      memory: 256Mi
    requests:
      cpu: 10m
      memory: 128Mi
  namespaceOverride: ""
  podLabels:
    ## Add the 'node-exporter' label to be used by serviceMonitor to match standard common usage in rules and grafana dashboards
    ##
    jobLabel: node-exporter

  extraArgs:

## Manages Prometheus and Alertmanager components
prometheusOperator:
  enabled: true
  requests:
    cpu: "100m"
    memory: "2Gi"
  limits:
    cpu: "2000m"
    memory: "5Gi"

  ## Prometheus-Operator v0.39.0 and later support TLS natively.
  tls:
    enabled: true

## Admission webhook support for PrometheusRules resources added in Prometheus Operator 0.30 can be enabled to prevent incorrectly formatted
## rules from making their way into prometheus and potentially preventing the container from starting
admissionWebhooks:
  failurePolicy: Fail
  enabled: true

## If enabled, generate a self-signed certificate, then patch the webhook configurations with the generated data.

## On chart upgrades (or if the secret exists) the cert will not be re-generated. You can use this to provide your own
## certs ahead of time if you wish.
##
patch:
  enabled: true
  image:
    repository: registry.k8s.io/ingress-nginx/kube-webhook-certgen
    tag: v1.1.1
    sha: ""
    pullPolicy: IfNotPresent
  resources:
    limits:
      cpu: 200m
      memory: 100Mi
    requests:
      cpu: 100m
      memory: 50Mi
  ## Provide a priority class name to the webhook patching job
  ##
  priorityClassName: ""
  podAnnotations: {}
  nodeSelector: {}
  affinity: {}
  tolerations: []
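Since the hang is on the admission webhook certificate job created by this patch config, one workaround sometimes used while debugging (not necessarily the right long-term fix) is to install with the admission webhooks and the patch job disabled, which removes the admission-create/admission-patch hook jobs entirely. A sketch reusing the command from this report:

```sh
# Hypothetical workaround: install without the admission webhook hook jobs.
helm upgrade prometheus prometheus-community/kube-prometheus-stack --install --version 43.0.0 \
  --namespace namespace_name --create-namespace --debug --wait \
  --set prometheusOperator.admissionWebhooks.enabled=false \
  --set prometheusOperator.admissionWebhooks.patch.enabled=false \
  -f <(envsubst < "$(dirname "$BASH_SOURCE")/thirdParty/prometheusConfig.yaml")
```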

## Namespaces to scope the interaction of the Prometheus Operator and the apiserver (allow list).
## This is mutually exclusive with denyNamespaces. Setting this to an empty object will disable the configuration
namespaces: {}

releaseNamespace: true
# additional:
# - kube-system

## Namespaces not to scope the interaction of the Prometheus Operator (deny list).
denyNamespaces: []

## Filter namespaces to look for prometheus-operator custom resources
alertmanagerInstanceNamespaces: []
prometheusInstanceNamespaces: []
thanosInstanceNamespaces: []

## Service account for Alertmanager to use.
## ref: https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/
serviceAccount:
  create: true
  name: ""

## Configuration for Prometheus operator service
service:
  annotations: {}
  labels: {}
  clusterIP: ""

## Port to expose on each node
## Only used if service.type is 'NodePort'
##
nodePort: Provided Values

nodePortTls: Provided Values

## Additional ports to open for Prometheus service
## ref: https://kubernetes.io/docs/concepts/services-networking/service/#multi-port-services
##
additionalPorts: []

## Loadbalancer IP
## Only use if service.type is "loadbalancer"
##
loadBalancerIP: ""
loadBalancerSourceRanges: []

## Service type
## NodePort, ClusterIP, loadbalancer
##
type: ClusterIP

## List of IP addresses at which the Prometheus server service is available
## Ref: https://kubernetes.io/docs/user-guide/services/#external-ips
##
externalIPs: []

## Labels to add to the operator pod
podLabels: {}

## Annotations to add to the operator pod
podAnnotations: {}

## Assign a PriorityClassName to pods if set
priorityClassName: ""

kubeletService:
  enabled: false
  namespace: kube-system

Create a servicemonitor for the operator

serviceMonitor:

Scrape interval. If not set, the Prometheus default scrape interval is used.

##
interval: ""
## Scrape timeout. If not set, the Prometheus default scrape timeout is used.
scrapeTimeout: ""
selfMonitor: true

##  metric relabel configs to apply to samples before ingestion.
##
metricRelabelings: []
relabelings: []

resources:
  requests:
    memory: 6Gi
    cpu: 1000m
  limits:
    memory: 12Gi
    cpu: 2000m

## Required for use in managed kubernetes clusters (such as AWS EKS) with custom CNI (such as calico),
## because control-plane managed by AWS cannot communicate with pods' IP CIDR and admission webhooks are not working
hostNetwork: false

## Define which Nodes the Pods are scheduled on.
## ref: https://kubernetes.io/docs/user-guide/node-selection/
nodeSelector: {}

## Tolerations for use with node taints
## ref: https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/
tolerations: []

affinity: {}

securityContext:
  fsGroup: 65534
  runAsGroup: 65534
  runAsNonRoot: true
  runAsUser: 65534

## Prometheus-operator image
image:
  repository: quay.io/prometheus-operator/prometheus-operator
  tag: v0.61.1
  sha: ""
  pullPolicy: IfNotPresent

## Configmap-reload image to use for reloading configmaps
configmapReloadImage:
  repository: docker.io/jimmidyson/configmap-reload
  tag: v0.4.0
  sha: ""

Prometheus-config-reloader image to use for config and rule reloading

prometheusConfigReloaderImage:

image to use for config and rule reloading

image:
  registry: quay.io
  repository: prometheus-operator/prometheus-config-reloader
  tag: v0.61.1
  sha: ""

# resource config for prometheusConfigReloader
resources:
  requests:
    cpu: 100m
    memory: 50Mi
  limits:
    cpu: 100m
    memory: 50Mi

## Thanos side-car image when configured
thanosImage:
  repository: quay.io/thanos/thanos
  tag: v0.29.0
  sha: ""

## Set a Field Selector to filter watched secrets
secretFieldSelector: ""

## Deploy a Prometheus instance
prometheus:
  enabled: true

  ## Annotations for Prometheus
  annotations: {}

  ## Service account for Prometheuses to use.
  ## ref: https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/
  serviceAccount:
    create: true
    name: ""

  ## Configuration for Prometheus service
  service:
    annotations: {}
    labels: {}
    clusterIP: ""

## Port for Prometheus Service to listen on
##
port: Provided Values

## To be used with a proxy extraContainer port
targetPort: Provided Values

## List of IP addresses at which the Prometheus server service is available
## Ref: https://kubernetes.io/docs/user-guide/services/#external-ips
##
externalIPs: []

## Port to expose on each node
## Only used if service.type is 'NodePort'
##
nodePort: Provided Values

## Loadbalancer IP
## Only use if service.type is "loadbalancer"
loadBalancerIP: ""
loadBalancerSourceRanges: []
## Service type
##
type: ClusterIP

sessionAffinity: ""

## Configuration for creating a separate Service for each statefulset Prometheus replica
servicePerReplica:
  enabled: false
  annotations: {}

## Port for Prometheus Service per replica to listen on
##
port: Provided Values

## To be used with a proxy extraContainer port
targetPort: Provided Values

## Port to expose on each node
## Only used if servicePerReplica.type is 'NodePort'
##
nodePort: Provided Values

## Loadbalancer source IP ranges
## Only used if servicePerReplica.type is "loadbalancer"
loadBalancerSourceRanges: []
## Service type
##
type: ClusterIP

podDisruptionBudget:
  enabled: false
  minAvailable: 1
  maxUnavailable: ""

## Ingress exposes thanos sidecar outside the cluster
thanosIngress:
  enabled: false
  annotations: {}
  labels: {}
  servicePort: Provided Values
## Hosts must be provided if Ingress is enabled.
##
hosts: []
  # - thanos-gateway.domain.com

## Paths to use for ingress rules
##
paths: []
tls: []
# - secretName: thanos-gateway-tls
#   hosts:
#   - thanos-gateway.domain.com

ingress:
  enabled: false

# For Kubernetes >= 1.18 you should specify the ingress-controller via the field ingressClassName
# See https://kubernetes.io/blog/2020/04/02/improvements-to-the-ingress-api-in-kubernetes-1.18/#specifying-the-class-of-an-ingress
# ingressClassName: nginx

annotations: {}
labels: {}

## Hostnames.
## Must be provided if Ingress is enabled.
##
# hosts:
#   - prometheus.domain.com
hosts: []

## Paths to use for ingress rules - one path should match the prometheusSpec.routePrefix
##
paths: []
# - /

## TLS configuration for Prometheus Ingress
## Secret must be manually created in the namespace
##
tls: []
  # - secretName: prometheus-general-tls
  #   hosts:
  #     - prometheus.example.com

## Configuration for creating an Ingress that will map to each Prometheus replica service
## prometheus.servicePerReplica must be enabled
ingressPerReplica:
  enabled: false

# For Kubernetes >= 1.18 you should specify the ingress-controller via the field ingressClassName
# See https://kubernetes.io/blog/2020/04/02/improvements-to-the-ingress-api-in-kubernetes-1.18/#specifying-the-class-of-an-ingress
# ingressClassName: nginx

annotations: {}
labels: {}

hostPrefix: ""
## Domain that will be used for the per replica ingress
hostDomain: ""

## Paths to use for ingress rules
##
paths: []
# - /

## Secret name containing the TLS certificate for Prometheus per replica ingress
## Secret must be manually created in the namespace
tlsSecretName: ""

## Separated secret for each per replica Ingress. Can be used together with cert-manager
##
tlsSecretPerReplica:
  enabled: false
  ## Final form of the secret for each per replica ingress is
  ## {{ tlsSecretPerReplica.prefix }}-{{ $replicaNumber }}
  ##
  prefix: "prometheus"

## Configure additional options for default pod security policy for Prometheus
## ref: https://kubernetes.io/docs/concepts/policy/pod-security-policy/
podSecurityPolicy:
  allowedCapabilities: []

serviceMonitor:

Scrape interval. If not set, the Prometheus default scrape interval is used.

##
interval: ""
selfMonitor: true

## scheme: HTTP scheme to use for scraping. Can be used with `tlsConfig` for example if using istio mTLS.
scheme: ""

## tlsConfig: TLS configuration to use when scraping the endpoint. For example if using istio mTLS.
## Of type: https://github.com/prometheus-operator/prometheus-operator/blob/master/Documentation/api.md#tlsconfig
tlsConfig: {}

bearerTokenFile:

##  metric relabel configs to apply to samples before ingestion.
##
metricRelabelings: []
# - action: keep
#   regex: 'kube_(daemonset|deployment|pod|namespace|node|statefulset).+'
#   sourceLabels: [__name__]

#   relabel configs to apply to samples before ingestion.
##
relabelings: []
# - sourceLabels: [__meta_kubernetes_pod_node_name]
#   separator: ;
#   regex: ^(.*)$
#   targetLabel: nodename
#   replacement: $1
#   action: replace

## Settings affecting prometheusSpec
## ref: https://github.com/prometheus-operator/prometheus-operator/blob/master/Documentation/api.md#prometheusspec
prometheusSpec:
  ## If true, pass --storage.tsdb.max-block-duration=2h to prometheus. This is already done if using Thanos

##
disableCompaction: false
## APIServerConfig
## ref: https://github.com/prometheus-operator/prometheus-operator/blob/master/Documentation/api.md#apiserverconfig
##
apiserverConfig: {}

## Interval between consecutive scrapes.
##
scrapeInterval: "1m"

## Interval between consecutive evaluations.
##
evaluationInterval: "1m"

## ListenLocal makes the Prometheus server listen on loopback, so that it does not bind against the Pod IP.
##
listenLocal: false

## EnableAdminAPI enables Prometheus the administrative HTTP API which includes functionality such as deleting time series.
## This is disabled by default.
## ref: https://prometheus.io/docs/prometheus/latest/querying/api/#tsdb-admin-apis
##
enableAdminAPI: false

## Image of Prometheus.
##
image:
  repository: quay.io/prometheus/prometheus
  tag: v2.40.5
  sha: ""

## Tolerations for use with node taints
## ref: https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/
##
tolerations: []
#  - key: "key"
#    operator: "Equal"
#    value: "value"
#    effect: "NoSchedule"

## Alertmanagers to which alerts will be sent
## ref: https://github.com/prometheus-operator/prometheus-operator/blob/master/Documentation/api.md#alertmanagerendpoints
##
## Default configuration will connect to the alertmanager deployed as part of this release
##
alertingEndpoints: []
# - name: ""
#   namespace: ""
#   port: http
#   scheme: http
#   pathPrefix: ""
#   tlsConfig: {}
#   bearerTokenFile: ""
#   apiVersion: v2

## External labels to add to any time series or alerts when communicating with external systems
##
externalLabels: {}

## Name of the external label used to denote replica name
##
replicaExternalLabelName: ""

## If true, the Operator won't add the external label used to denote replica name
##
replicaExternalLabelNameClear: false

## Name of the external label used to denote Prometheus instance name
##
prometheusExternalLabelName: ""

## If true, the Operator won't add the external label used to denote Prometheus instance name
##
prometheusExternalLabelNameClear: false

## External URL at which Prometheus will be reachable.
##
externalUrl: ""

## Define which Nodes the Pods are scheduled on.
## ref: https://kubernetes.io/docs/user-guide/node-selection/
##
nodeSelector: {}

## Secrets is a list of Secrets in the same namespace as the Prometheus object, which shall be mounted into the Prometheus Pods.
## The Secrets are mounted into /etc/prometheus/secrets/. Secrets changes after initial creation of a Prometheus object are not
## reflected in the running Pods. To change the secrets mounted into the Prometheus Pods, the object must be deleted and recreated
## with the new list of secrets.
##
secrets: []

## ConfigMaps is a list of ConfigMaps in the same namespace as the Prometheus object, which shall be mounted into the Prometheus Pods.
## The ConfigMaps are mounted into /etc/prometheus/configmaps/.
##
configMaps: []

## QuerySpec defines the query command line flags when starting Prometheus.
## ref: https://github.com/prometheus-operator/prometheus-operator/blob/master/Documentation/api.md#queryspec
##
query: {}

## Namespaces to be selected for PrometheusRules discovery.
## If nil, select own namespace. Namespaces to be selected for ServiceMonitor discovery.
## See https://github.com/prometheus-operator/prometheus-operator/blob/master/Documentation/api.md#namespaceselector for usage
##
ruleNamespaceSelector: {}

## If true, a nil or {} value for prometheus.prometheusSpec.ruleSelector will cause the
## prometheus resource to be created with selectors based on values in the helm deployment,
## which will also match the PrometheusRule resources created
##
ruleSelectorNilUsesHelmValues: true

## PrometheusRules to be selected for target discovery.
## If {}, select all ServiceMonitors
##
ruleSelector: {}

serviceMonitorSelectorNilUsesHelmValues: true

## ServiceMonitors to be selected for target discovery.
## If {}, select all ServiceMonitors
##
serviceMonitorSelector: {}

serviceMonitorNamespaceSelector: {}

podMonitorSelectorNilUsesHelmValues: true

podMonitorSelector: {}

podMonitorNamespaceSelector: {}

## If true, a nil or {} value for prometheus.prometheusSpec.probeSelector will cause the
## prometheus resource to be created with selectors based on values in the helm deployment,
## which will also match the probes created
##
probeSelectorNilUsesHelmValues: true

## Probes to be selected for target discovery.
## If {}, select all Probes
##
probeSelector: {}

probeNamespaceSelector: {}

## How long to retain metrics
##
retention: 7d

## Maximum size of metrics
##
retentionSize: "150GB"

## Enable compression of the write-ahead log using Snappy.
##
walCompression: false

## If true, the Operator won't process any Prometheus configuration changes
##
paused: false

## Number of Prometheus replicas desired
##
replicas: 1

## Log level for Prometheus be configured in
##
logLevel: info

## Log format for Prometheus be configured in
##
logFormat: logfmt

## Prefix used to register routes, overriding externalUrl route.
## Useful for proxies that rewrite URLs.
##
routePrefix: /

## Standard object’s metadata. More info: https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#metadata
## Metadata Labels and Annotations gets propagated to the prometheus pods.
##
podMetadata: {}

podAntiAffinity: ""

## If anti-affinity is enabled sets the topologyKey to use for anti-affinity.
## This can be changed to, for example, failure-domain.beta.kubernetes.io/zone
##
podAntiAffinityTopologyKey: kubernetes.io/hostname

## Assign custom affinity rules to the prometheus instance
## ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/
##
affinity: {}

remoteRead: []
# - url: http://remote1/read

## The remote_write spec configuration for Prometheus.
## ref: https://github.com/prometheus-operator/prometheus-operator/blob/master/Documentation/api.md#remotewritespec
remoteWrite: []
# - url: http://remote1/push

## Enable/Disable Grafana dashboards provisioning for prometheus remote write feature
remoteWriteDashboards: false

## Resource limits & requests
##
resources:
  requests:
    memory: 6Gi
    cpu: 1000m
  limits:
    memory: 12Gi
    cpu: 2000m

## Prometheus StorageSpec for persistent data
## ref: https://github.com/prometheus-operator/prometheus-operator/blob/master/Documentation/user-guides/storage.md
##
storageSpec:
  volumeClaimTemplate:
    spec:
      storageClassName: "managed-premium"
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 300Gi
    # selector: {}

# Additional volumes on the output StatefulSet definition.
volumes: []
# Additional VolumeMounts on the output StatefulSet definition.
volumeMounts: []

additionalScrapeConfigs:
  - job_name: "kubernetes-pods"
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      - source_labels:
          [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      - source_labels:
          [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
      - action: labelmap
        regex: __meta_kubernetes_pod_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_pod_name]
        action: replace
        target_label: kubernetes_pod_name
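For reference, this additionalScrapeConfigs job only keeps pods annotated with prometheus.io/scrape=true, and rewrites the scrape path and port from the matching annotations. A minimal sketch of annotating a pod so it gets picked up (demo-app and port 9113 are hypothetical names for illustration only):

```sh
# Hypothetical pod and port; the annotations are the ones the relabel rules above look for.
kubectl -n namespace_name annotate pod demo-app \
  prometheus.io/scrape=true \
  prometheus.io/port=9113 \
  prometheus.io/path=/metrics
```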

## If additional scrape configurations are already deployed in a single secret file you can use this section.
## Expected values are the secret name and key
## Cannot be used with additionalScrapeConfigs
additionalScrapeConfigsSecret: {}
  # enabled: false
  # name:
  # key:

## additionalPrometheusSecretsAnnotations allows to add annotations to the kubernetes secret. This can be useful
## when deploying via spinnaker to disable versioning on the secret, strategy.spinnaker.io/versioned: 'false'
additionalPrometheusSecretsAnnotations: {}

additionalAlertManagerConfigs: []

additionalAlertRelabelConfigs: []

securityContext:
  runAsGroup: 2000
  runAsNonRoot: true
  runAsUser: 1000
  fsGroup: 2000

##  Priority class assigned to the Pods
##
priorityClassName: ""

thanos: {}

## Containers allows injecting additional containers. This is meant to allow adding an authentication proxy to a Prometheus pod.
##  if using proxy extraContainer  update targetPort with proxy container port
containers: []

## InitContainers allows injecting additional initContainers. This is meant to allow doing some changes
## (permissions, dir tree) on mounted volumes before starting prometheus
initContainers: []

## PortName to use for Prometheus.
##
portName: "http-web"

additionalServiceMonitors: []

additionalPodMonitors: []

chand567 commented 2 months ago

Is this resolved? I see a similar issue.