Open jaylevin opened 2 years ago
Right now our recommendation is to install prometheus via the official helm chart:
```shell
kubectl create namespace monitoring
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install prometheus prometheus-community/prometheus --namespace monitoring
```
Here the ConfigMap already exists and just needs to be overwritten. Though we're having some issues with that, see #240
Do you think the operator is better suited for this? But would that mean we can no longer support "classical" prometheus on Kubernetes installations?
The following slack discussion https://keptn.slack.com/archives/CNRCGFU3U/p1643028340100100 reveals that we are not compatible with the Prometheus operator.
It seems that the names of services and pods/deployments have changed:
```
$ kubectl -n monitoring get all
NAME READY STATUS RESTARTS AGE
pod/alertmanager-prometheus-operator-alertmanager-0 0/2 ContainerCreating 0 10d
pod/metrics-server-85496d4f7c-djzjj 1/1 Running 0 10d
pod/prometheus-operator-grafana-588d549949-x2tg8 2/2 Running 2 59d
pod/prometheus-operator-grafana-test 0/1 Completed 0 59d
pod/prometheus-operator-kube-state-metrics-64d56fc9df-wp8wc 1/1 Running 0 10d
pod/prometheus-operator-operator-7fb8c9f85c-2nvgh 2/2 Running 0 10d
pod/prometheus-operator-prometheus-node-exporter-4flc8 1/1 Running 1 111d
pod/prometheus-operator-prometheus-node-exporter-5lw7d 1/1 Running 4 111d
pod/prometheus-operator-prometheus-node-exporter-72w6s 1/1 Running 1 111d
pod/prometheus-operator-prometheus-node-exporter-7b6p2 1/1 Running 2 111d
pod/prometheus-operator-prometheus-node-exporter-9rx2g 1/1 Running 3 111d
pod/prometheus-operator-prometheus-node-exporter-q66cl 1/1 Running 2 89d
pod/prometheus-operator-prometheus-node-exporter-rmd9j 1/1 Running 6 89d
pod/prometheus-operator-prometheus-node-exporter-s4f5d 1/1 Running 1 111d
pod/prometheus-operator-prometheus-node-exporter-v6vbz 1/1 Running 1 111d
pod/prometheus-operator-prometheus-node-exporter-vbtlh 1/1 Running 1 111d
pod/prometheus-operator-prometheus-node-exporter-x6vhm 1/1 Running 2 111d
pod/prometheus-operator-prometheus-node-exporter-xccx9 1/1 Running 1 111d
pod/prometheus-prometheus-operator-prometheus-0 3/3 Running 3 58d
pod/telegraf-daemonset-5f9cv 2/2 Running 2 111d
pod/telegraf-daemonset-7c2xn 2/2 Running 2 111d
pod/telegraf-daemonset-7gxcb 2/2 Running 2 111d
pod/telegraf-daemonset-7nfwl 2/2 Running 2 111d
pod/telegraf-daemonset-9225k 2/2 Running 4 111d
pod/telegraf-daemonset-cqjd5 2/2 Running 2 111d
pod/telegraf-daemonset-dp2hb 2/2 Running 4 111d
pod/telegraf-daemonset-fhc9m 2/2 Running 6 111d
pod/telegraf-daemonset-hjqmj 2/2 Running 12 89d
pod/telegraf-daemonset-ljsxf 2/2 Running 4 111d
pod/telegraf-daemonset-njsxx 2/2 Running 4 89d
pod/telegraf-daemonset-vhhl8 2/2 Running 2 111d
pod/telegraf-deployment-6448f95b55-gn4ph 1/1 Running 0 10d

NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/alertmanager-operated ClusterIP None <none> 9093/TCP,9094/TCP,9094/UDP 111d
service/metrics-server ClusterIP 10.233.32.242 <none> 443/TCP 105d
service/prometheus-operated ClusterIP None <none> 9090/TCP 111d
service/prometheus-operator-alertmanager ClusterIP 10.233.19.107 <none> 9093/TCP 111d
service/prometheus-operator-grafana ClusterIP 10.233.42.175 <none> 80/TCP 111d
service/prometheus-operator-kube-state-metrics ClusterIP 10.233.34.37 <none> 8080/TCP 111d
service/prometheus-operator-operator ClusterIP 10.233.57.170 <none> 8080/TCP,443/TCP 111d
service/prometheus-operator-prometheus ClusterIP 10.233.35.123 <none> 9090/TCP 111d
service/prometheus-operator-prometheus-node-exporter ClusterIP 10.233.36.118 <none> 9100/TCP 111d
service/telegraf-deployment ClusterIP 10.233.37.51 <none> 9273/TCP 111d

NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/prometheus-operator-prometheus-node-exporter 12 12 12 12 12 <none> 111d
daemonset.apps/telegraf-daemonset 12 12 12 12 12 <none> 111d

NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/metrics-server 1/1 1 1 105d
deployment.apps/prometheus-operator-grafana 1/1 1 1 111d
deployment.apps/prometheus-operator-kube-state-metrics 1/1 1 1 111d
deployment.apps/prometheus-operator-operator 1/1 1 1 111d
deployment.apps/telegraf-deployment 1/1 1 1 111d

NAME DESIRED CURRENT READY AGE
replicaset.apps/metrics-server-85496d4f7c 1 1 1 105d
replicaset.apps/prometheus-operator-grafana-588d549949 1 1 1 59d
replicaset.apps/prometheus-operator-grafana-5c86cf65f9 0 0 0 59d
replicaset.apps/prometheus-operator-grafana-7857784dcd 0 0 0 111d
replicaset.apps/prometheus-operator-kube-state-metrics-64d56fc9df 1 1 1 111d
replicaset.apps/prometheus-operator-operator-7fb8c9f85c 1 1 1 111d
replicaset.apps/telegraf-deployment-6448f95b55 1 1 1 111d

NAME READY AGE
statefulset.apps/alertmanager-prometheus-operator-alertmanager 0/1 111d
statefulset.apps/prometheus-prometheus-operator-prometheus 1/1 111d

NAME COMPLETIONS DURATION AGE
job.batch/prometheus-operator-admission-create 1/1 5s 111d
job.batch/prometheus-operator-admission-patch 1/1 94s 111d
```
In prometheus-service we are looking for

```
service/prometheus-server ClusterIP 10.24.45.75 <none> 80/TCP 37d
service/prometheus-alertmanager ClusterIP 10.24.32.99 <none> 80/TCP 37d
```

but those services are not available.
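To make the naming mismatch concrete, here is a small sketch of how one could tell the two install flavors apart from the Service names alone. The helper function and flavor labels are hypothetical, not part of prometheus-service; the input is what `kubectl -n monitoring get svc -o name` would print.

```shell
# Hypothetical helper: given a list of Service names from the monitoring
# namespace, report which Prometheus install flavor the names correspond to.
detect_flavor() {
  if echo "$1" | grep -q 'service/prometheus-server'; then
    echo "community-helm-chart"   # the names prometheus-service expects
  elif echo "$1" | grep -q 'service/prometheus-operated'; then
    echo "prometheus-operator"    # operator-managed service names
  else
    echo "unknown"
  fi
}

# The service list from the operator-managed cluster shown above:
detect_flavor "service/prometheus-operated
service/prometheus-operator-prometheus"   # prints: prometheus-operator
```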
@christian-kreuzberger-dtx FYI I am currently doing analysis on this, as I would like to use the operator also.
Sure! Please post your findings here! Looping in @thisthat and @oleg-nenashev on this change.
+1. I will add it to my watch list for Keptn LTS
Is anybody working on this? I would give it a try.
My recommendation is that folks using the operator can bring their own (BYO) configuration and simply not use Keptn's configure-monitoring step; the get-sli part will still work. I can present it at the developer meeting.
so what do you suggest doing?
This issue is to address the incompatibility between the Keptn prometheus-service and Prometheus deployed via the Prometheus Operator.

Currently, the Keptn prometheus-service depends on reading/writing both the prometheus and alertmanager ConfigMaps that are deployed as part of the Prometheus Community Helm chart. However, when Prometheus is deployed on K8s via the prometheus-operator, these ConfigMaps do not exist. Instead, (from my very limited understanding) the prometheus-operator watches for ServiceMonitor CRs in order to configure new scrape jobs. The prometheus-service Keptn integration should ideally be able to deploy these CRs in order to create a new scrape job for each service/project/stage that is configured to be monitored.
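For illustration, a minimal ServiceMonitor of the kind the operator picks up might look like the sketch below. The names, labels, and namespaces are all hypothetical, and the operator's Prometheus CR must be configured (via `serviceMonitorSelector`) to select the ServiceMonitor's labels for it to take effect:

```yaml
# Hypothetical example: a ServiceMonitor the prometheus-operator would turn
# into a scrape job. All names and labels below are illustrative only.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: keptn-carts-production   # hypothetical: one per service/stage
  namespace: monitoring
  labels:
    release: prometheus          # must match the Prometheus CR's serviceMonitorSelector
spec:
  selector:
    matchLabels:
      app: carts                 # hypothetical label on the monitored Service
  namespaceSelector:
    matchNames:
      - production               # hypothetical project/stage namespace
  endpoints:
    - port: metrics              # named port on the target Service
      interval: 30s
```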