vmware-tanzu / helm-charts

Contains Helm charts for Kubernetes related open source tools
https://vmware-tanzu.github.io/helm-charts/
Apache License 2.0

Velero ServiceMonitor is not discovered by Prometheus Operator (kube-prometheus-stack) #609

Open darnone opened 1 month ago

darnone commented 1 month ago

What steps did you take and what happened:

I have deployed Velero into a namespace called velero using helmfile, and I have prometheus-operator running in a namespace called monitoring. I have enabled metrics with the following values:

metrics:
  enabled: true
  scrapeInterval: 30s
  scrapeTimeout: 10s

  # Pod annotations for Prometheus
  podAnnotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "8085"
    prometheus.io/path: "/metrics"

  serviceMonitor:
    autodetect: false
    enabled: true
    annotations: 
      "helm.sh/hook": post-install,post-upgrade
    namespace: monitoring

The ServiceMonitor is created in the monitoring namespace, but Prometheus does not show it as a target. The generated ServiceMonitor is:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: velero
  namespace: monitoring
  annotations:
    helm.sh/hook: post-install,post-upgrade
  labels:
    app.kubernetes.io/name: velero
    app.kubernetes.io/instance: velero
    app.kubernetes.io/managed-by: Helm
    helm.sh/chart: velero-7.1.1
spec:
  namespaceSelector:
    matchNames:
      - velero
  selector:
    matchLabels:
      app.kubernetes.io/name: velero
      app.kubernetes.io/instance: velero
  endpoints:
  - port: http-monitoring
    interval: 30s
    scrapeTimeout: 10s

What did you expect to happen:

I expected Prometheus to discover the ServiceMonitor, list it under Service Discovery and Targets in the Prometheus UI, and scrape its metrics.

The output of the following commands will help us better understand what's going on: (Pasting long output into a GitHub gist or other pastebin is fine.)

Nothing appears abnormal in the Prometheus or Velero logs.

Anything else you would like to add:

I am using helmfile v0.166.0. The kube-prometheus-stack is working for the default Kubernetes services.

Environment:

darnone commented 1 month ago

I have written to the velero-users channel on Slack but have not received a reply.

darnone commented 1 month ago

I am attaching the Velero pod and deployment manifests: velero-manifests.zip

phac008 commented 1 month ago

Try adding additionalLabels to the serviceMonitor values:

serviceMonitor:
  additionalLabels:
    release: kube-prometheus-stack

darnone commented 1 month ago

That did it. What led you to find that fix?
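
A likely explanation, assuming the kube-prometheus-stack chart is running with its default settings: the chart configures the Prometheus resource with a serviceMonitorSelector that matches only ServiceMonitors labeled release: <Helm release name>, so a ServiceMonitor without that label is ignored even though it exists in the cluster. The sketch below shows the two usual ways around this. It assumes the stack's Helm release is named kube-prometheus-stack (as the label in the fix above suggests); adjust the label value if your release name differs.

```yaml
# Option 1 (Velero chart values): label the ServiceMonitor so the
# default selector of kube-prometheus-stack picks it up.
metrics:
  enabled: true
  serviceMonitor:
    enabled: true
    namespace: monitoring
    additionalLabels:
      # Assumption: the kube-prometheus-stack Helm release is named
      # "kube-prometheus-stack"; use your own release name here.
      release: kube-prometheus-stack
---
# Option 2 (kube-prometheus-stack chart values): stop restricting
# discovery to the release label, so all ServiceMonitors are selected.
prometheus:
  prometheusSpec:
    serviceMonitorSelectorNilUsesHelmValues: false
```

Option 1 keeps the change local to Velero. Option 2 changes discovery for every ServiceMonitor in the cluster, which may or may not be what you want.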