Closed NewmanJ1987 closed 2 years ago
So, did you actually run kubectl create secret generic additional-scrape-configs --from-file=prometheus-additional.yaml -n monitoring? Or just kubectl apply -f prometheus-additional.yaml? Because, assuming you used 0.26, the file itself is already a Secret: https://github.com/strimzi/strimzi-kafka-operator/blob/main/examples/metrics/prometheus-additional-properties/prometheus-additional.yaml ... so you should not create a secret from it.
I did this sir.
kubectl create secret generic additional-scrape-configs --from-file=prometheus-additional.yaml -n monitoring
I was following this guide. https://snourian.com/kafka-kubernetes-strimzi-part-3-monitoring-strimzi-kafka-with-prometheus-grafana/
Ok, so can you try kubectl apply -f prometheus-additional.yaml -n monitoring to see if it helps?
Running kubectl create secret generic additional-scrape-configs --from-file=prometheus-additional.yaml -n monitoring would create a secret inside a secret, and Prometheus would not understand it. I guess this might have changed since the blog post.
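The "secret inside a secret" problem can be illustrated with a small Python sketch. It only mimics what kubectl create secret generic --from-file does conceptually (the whole file becomes one base64-encoded value); the two file contents and the helper names are hypothetical stand-ins, not the real Strimzi example file or any Kubernetes API.

```python
import base64

# Hypothetical, shortened stand-in for a file that is ALREADY a Secret manifest.
SECRET_MANIFEST = """\
apiVersion: v1
kind: Secret
metadata:
  name: additional-scrape-configs
type: Opaque
stringData:
  prometheus-additional.yaml: |
    - job_name: kubernetes-cadvisor
"""

# Hypothetical stand-in for a file that holds only the scrape configs.
PLAIN_SCRAPE_CONFIGS = """\
- job_name: kubernetes-cadvisor
"""


def create_secret_from_file(filename, content):
    """Mimic --from-file: store the entire file content as one base64 value."""
    return {filename: base64.b64encode(content.encode()).decode()}


def payload_is_scrape_config_list(secret_data, key):
    """Return True if the decoded payload starts like a YAML list of scrape jobs."""
    decoded = base64.b64decode(secret_data[key]).decode()
    first_line = next(line for line in decoded.splitlines() if line.strip())
    return first_line.lstrip().startswith("- ")


wrapped = create_secret_from_file("prometheus-additional.yaml", SECRET_MANIFEST)
plain = create_secret_from_file("prometheus-additional.yaml", PLAIN_SCRAPE_CONFIGS)

# The wrapped payload is a Secret manifest, not a list of scrape jobs.
print(payload_is_scrape_config_list(wrapped, "prometheus-additional.yaml"))
print(payload_is_scrape_config_list(plain, "prometheus-additional.yaml"))
```

In the first case Prometheus would be handed a Secret manifest where it expects a list of scrape jobs, which is why applying the file directly with kubectl apply is the suggested fix.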
Ok, so this error goes away, but I don't know if it's working correctly. I can't see any data after installing the Grafana dashboards.
Well, I'm not an expert on Prometheus ... but you can check:
The kafka mycluster pod is missing the "prometheus.io/scrape" == "true" annotation.
We don't set any annotations, since every user uses a different Prometheus configuration and hardcoded annotations are a cause of issues. If you installed the PodMonitors we provide, the Prometheus Operator should configure your Prometheus to scrape the pods without any special annotations.
But if you need the annotation, you can set it via the Kafka custom resource: https://strimzi.io/docs/operators/latest/full/using.html#assembly-customizing-kubernetes-resources-str
I think I made some progress. I applied the PodMonitors and the Prometheus rules like so:
kubectl apply -f strimzi-pod-monitor.yaml -n monitoring
kubectl apply -f prometheus-rules.yaml -n monitoring;
I am able to see some metrics now on one Grafana dashboard (Strimzi Operators), but I still see no data in another dashboard (Strimzi Kafka Exporter). Any idea why this may be the case?
I guess that suggests that it now scrapes some pods, but not all of them.
Should I restart the pods ?
Yeah, it is strange that some of the dashboards show data and the others do not. I am basically just applying the YAML from your examples/metrics. Do you know a way of checking for errors thrown by the PodMonitors?
That would be somewhere in the Prometheus Operator. But I have no experience with debugging it I'm afraid.
Note that the Cluster name and Namespace in the Kafka Exporter dashboard are not set. Without these values, the rest of the graphs show no data. Try reloading the page to see whether Grafana fetches them properly. If not, check the kafka_exporter_build_info metric in the Prometheus UI. This metric should be present and have the labels set. If not, there is some issue with scraping the metrics from Kafka Exporter.
// edit
Also, I remember there is some issue with Kafka Exporter. It does not emit some metrics when there is no traffic in the Kafka cluster. Is that the case?
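The kafka_exporter_build_info check described above can also be done against the Prometheus HTTP API instead of the UI. The following is a minimal sketch, assuming Prometheus is reachable at http://localhost:9090 (for example through a port-forward) and that the dashboard variables are fed by the namespace and strimzi_io_cluster labels; both are assumptions, and the sample response is made up for illustration.

```python
import json
from urllib.parse import urlencode

PROMETHEUS_URL = "http://localhost:9090"  # assumption: local port-forward
REQUIRED_LABELS = ("namespace", "strimzi_io_cluster")  # assumption: labels the dashboard needs


def build_query_url(metric, base_url=PROMETHEUS_URL):
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url}/api/v1/query?{urlencode({'query': metric})}"


def missing_labels(response_body, required=REQUIRED_LABELS):
    """Return {series index: [missing labels]}, or None when no series came back."""
    payload = json.loads(response_body)
    series = payload.get("data", {}).get("result", [])
    if not series:
        return None  # the metric is not being scraped at all
    return {
        index: [label for label in required if label not in item.get("metric", {})]
        for index, item in enumerate(series)
    }


# Hypothetical response body, shaped like an instant-query result.
SAMPLE_RESPONSE = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [{
            "metric": {
                "__name__": "kafka_exporter_build_info",
                "namespace": "kafka",
                "strimzi_io_cluster": "my-cluster",
            },
            "value": [1636650208, "1"],
        }],
    },
})

print(build_query_url("kafka_exporter_build_info"))
print(missing_labels(SAMPLE_RESPONSE))
```

A result of None corresponds to the "metric is not present" case, while a non-empty list for a series corresponds to the "labels are not set" case.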
Hi,
As per your instructions, I created a topic and sent some traffic. Unfortunately, I still see no data.
The kafka_exporter_build_info metric is missing, and when I looked at the logs for the Kafka Exporter I see this:
[sarama] 2021/11/11 17:03:28 Closed connection to broker my-cluster-kafka-0.my-cluster-kafka-brokers.kafka.svc:9091
[sarama] 2021/11/11 17:03:37 client/metadata fetching metadata for all topics from broker my-cluster-kafka-bootstrap:9091
I1111 17:03:37.989779 11 kafka_exporter.go:366] Refreshing client metadata
[sarama] 2021/11/11 17:03:37 Connected to broker at my-cluster-kafka-0.my-cluster-kafka-brokers.kafka.svc:9091 (registered as #0)
I1111 17:03:38.069357 11 kafka_exporter.go:637] Fetching consumer group metrics
[sarama] 2021/11/11 17:03:38 Closed connection to broker my-cluster-kafka-0.my-cluster-kafka-brokers.kafka.svc:9091
[sarama] 2021/11/11 17:03:43 client/metadata fetching metadata for all topics from broker my-cluster-kafka-bootstrap:9091
[sarama] 2021/11/11 17:03:47 Connected to broker at my-cluster-kafka-0.my-cluster-kafka-brokers.kafka.svc:9091 (registered as #0)
I1111 17:03:48.075577 11 kafka_exporter.go:637] Fetching consumer group metrics
[sarama] 2021/11/11 17:03:48 Closed connection to broker my-cluster-kafka-0.my-cluster-kafka-brokers.kafka.svc:9091
I don't see any obvious errors. Can you point me in the right direction ?
A missing kafka_exporter_build_info metric could be caused by incorrectly configured Prometheus scraping. Can you share the config from the Prometheus UI? You should be able to find it under Status/Configuration.
Sure. It is the default one from the examples.
global:
scrape_interval: 30s
scrape_timeout: 10s
evaluation_interval: 30s
external_labels:
prometheus: monitoring/prometheus
prometheus_replica: prometheus-prometheus-0
alerting:
alert_relabel_configs:
- separator: ;
regex: prometheus_replica
replacement: $1
action: labeldrop
alertmanagers:
- follow_redirects: true
scheme: http
path_prefix: /
timeout: 10s
api_version: v2
relabel_configs:
- source_labels: [__meta_kubernetes_service_name]
separator: ;
regex: alertmanager
replacement: $1
action: keep
- source_labels: [__meta_kubernetes_endpoint_port_name]
separator: ;
regex: alertmanager
replacement: $1
action: keep
kubernetes_sd_configs:
- role: endpoints
kubeconfig_file: ""
follow_redirects: true
namespaces:
names:
- monitoring
rule_files:
- /etc/prometheus/rules/prometheus-prometheus-rulefiles-0/*.yaml
scrape_configs:
- job_name: podMonitor/monitoring/bridge-metrics/0
honor_timestamps: true
scrape_interval: 30s
scrape_timeout: 10s
metrics_path: /metrics
scheme: http
follow_redirects: true
relabel_configs:
- source_labels: [job]
separator: ;
regex: (.*)
target_label: __tmp_prometheus_job_name
replacement: $1
action: replace
- source_labels: [__meta_kubernetes_pod_label_strimzi_io_kind, __meta_kubernetes_pod_labelpresent_strimzi_io_kind]
separator: ;
regex: KafkaBridge;true
replacement: $1
action: keep
- source_labels: [__meta_kubernetes_pod_container_port_name]
separator: ;
regex: rest-api
replacement: $1
action: keep
- source_labels: [__meta_kubernetes_namespace]
separator: ;
regex: (.*)
target_label: namespace
replacement: $1
action: replace
- source_labels: [__meta_kubernetes_pod_container_name]
separator: ;
regex: (.*)
target_label: container
replacement: $1
action: replace
- source_labels: [__meta_kubernetes_pod_name]
separator: ;
regex: (.*)
target_label: pod
replacement: $1
action: replace
- separator: ;
regex: (.*)
target_label: job
replacement: monitoring/bridge-metrics
action: replace
- separator: ;
regex: (.*)
target_label: endpoint
replacement: rest-api
action: replace
- source_labels: [__address__]
separator: ;
regex: (.*)
modulus: 1
target_label: __tmp_hash
replacement: $1
action: hashmod
- source_labels: [__tmp_hash]
separator: ;
regex: "0"
replacement: $1
action: keep
kubernetes_sd_configs:
- role: pod
kubeconfig_file: ""
follow_redirects: true
namespaces:
names:
- kafka
- job_name: podMonitor/monitoring/cluster-operator-metrics/0
honor_timestamps: true
scrape_interval: 30s
scrape_timeout: 10s
metrics_path: /metrics
scheme: http
follow_redirects: true
relabel_configs:
- source_labels: [job]
separator: ;
regex: (.*)
target_label: __tmp_prometheus_job_name
replacement: $1
action: replace
- source_labels: [__meta_kubernetes_pod_label_strimzi_io_kind, __meta_kubernetes_pod_labelpresent_strimzi_io_kind]
separator: ;
regex: cluster-operator;true
replacement: $1
action: keep
- source_labels: [__meta_kubernetes_pod_container_port_name]
separator: ;
regex: http
replacement: $1
action: keep
- source_labels: [__meta_kubernetes_namespace]
separator: ;
regex: (.*)
target_label: namespace
replacement: $1
action: replace
- source_labels: [__meta_kubernetes_pod_container_name]
separator: ;
regex: (.*)
target_label: container
replacement: $1
action: replace
- source_labels: [__meta_kubernetes_pod_name]
separator: ;
regex: (.*)
target_label: pod
replacement: $1
action: replace
- separator: ;
regex: (.*)
target_label: job
replacement: monitoring/cluster-operator-metrics
action: replace
- separator: ;
regex: (.*)
target_label: endpoint
replacement: http
action: replace
- source_labels: [__address__]
separator: ;
regex: (.*)
modulus: 1
target_label: __tmp_hash
replacement: $1
action: hashmod
- source_labels: [__tmp_hash]
separator: ;
regex: "0"
replacement: $1
action: keep
kubernetes_sd_configs:
- role: pod
kubeconfig_file: ""
follow_redirects: true
namespaces:
names:
- kafka
- job_name: podMonitor/monitoring/entity-operator-metrics/0
honor_timestamps: true
scrape_interval: 30s
scrape_timeout: 10s
metrics_path: /metrics
scheme: http
follow_redirects: true
relabel_configs:
- source_labels: [job]
separator: ;
regex: (.*)
target_label: __tmp_prometheus_job_name
replacement: $1
action: replace
- source_labels: [__meta_kubernetes_pod_label_app_kubernetes_io_name, __meta_kubernetes_pod_labelpresent_app_kubernetes_io_name]
separator: ;
regex: entity-operator;true
replacement: $1
action: keep
- source_labels: [__meta_kubernetes_pod_container_port_name]
separator: ;
regex: healthcheck
replacement: $1
action: keep
- source_labels: [__meta_kubernetes_namespace]
separator: ;
regex: (.*)
target_label: namespace
replacement: $1
action: replace
- source_labels: [__meta_kubernetes_pod_container_name]
separator: ;
regex: (.*)
target_label: container
replacement: $1
action: replace
- source_labels: [__meta_kubernetes_pod_name]
separator: ;
regex: (.*)
target_label: pod
replacement: $1
action: replace
- separator: ;
regex: (.*)
target_label: job
replacement: monitoring/entity-operator-metrics
action: replace
- separator: ;
regex: (.*)
target_label: endpoint
replacement: healthcheck
action: replace
- source_labels: [__address__]
separator: ;
regex: (.*)
modulus: 1
target_label: __tmp_hash
replacement: $1
action: hashmod
- source_labels: [__tmp_hash]
separator: ;
regex: "0"
replacement: $1
action: keep
kubernetes_sd_configs:
- role: pod
kubeconfig_file: ""
follow_redirects: true
namespaces:
names:
- kafka
- job_name: podMonitor/monitoring/kafka-resources-metrics/0
honor_timestamps: true
scrape_interval: 30s
scrape_timeout: 10s
metrics_path: /metrics
scheme: http
follow_redirects: true
relabel_configs:
- source_labels: [job]
separator: ;
regex: (.*)
target_label: __tmp_prometheus_job_name
replacement: $1
action: replace
- source_labels: [__meta_kubernetes_pod_label_strimzi_io_kind, __meta_kubernetes_pod_labelpresent_strimzi_io_kind]
separator: ;
regex: Kafka|KafkaConnect|KafkaMirrorMaker|KafkaMirrorMaker2;true
replacement: $1
action: keep
- source_labels: [__meta_kubernetes_pod_container_port_name]
separator: ;
regex: tcp-prometheus
replacement: $1
action: keep
- source_labels: [__meta_kubernetes_namespace]
separator: ;
regex: (.*)
target_label: namespace
replacement: $1
action: replace
- source_labels: [__meta_kubernetes_pod_container_name]
separator: ;
regex: (.*)
target_label: container
replacement: $1
action: replace
- source_labels: [__meta_kubernetes_pod_name]
separator: ;
regex: (.*)
target_label: pod
replacement: $1
action: replace
- separator: ;
regex: (.*)
target_label: job
replacement: monitoring/kafka-resources-metrics
action: replace
- separator: ;
regex: (.*)
target_label: endpoint
replacement: tcp-prometheus
action: replace
- separator: ;
regex: __meta_kubernetes_pod_label_(strimzi_io_.+)
replacement: $1
action: labelmap
- source_labels: [__meta_kubernetes_namespace]
separator: ;
regex: (.*)
target_label: namespace
replacement: $1
action: replace
- source_labels: [__meta_kubernetes_pod_name]
separator: ;
regex: (.*)
target_label: kubernetes_pod_name
replacement: $1
action: replace
- source_labels: [__meta_kubernetes_pod_node_name]
separator: ;
regex: (.*)
target_label: node_name
replacement: $1
action: replace
- source_labels: [__meta_kubernetes_pod_host_ip]
separator: ;
regex: (.*)
target_label: node_ip
replacement: $1
action: replace
- source_labels: [__address__]
separator: ;
regex: (.*)
modulus: 1
target_label: __tmp_hash
replacement: $1
action: hashmod
- source_labels: [__tmp_hash]
separator: ;
regex: "0"
replacement: $1
action: keep
kubernetes_sd_configs:
- role: pod
kubeconfig_file: ""
follow_redirects: true
- job_name: kubernetes-cadvisor
honor_labels: true
honor_timestamps: true
scrape_interval: 10s
scrape_timeout: 10s
metrics_path: /metrics/cadvisor
scheme: https
authorization:
type: Bearer
credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
insecure_skip_verify: true
follow_redirects: true
relabel_configs:
- separator: ;
regex: __meta_kubernetes_node_label_(.+)
replacement: $1
action: labelmap
- separator: ;
regex: (.*)
target_label: __address__
replacement: kubernetes.default.svc:443
action: replace
- source_labels: [__meta_kubernetes_node_name]
separator: ;
regex: (.+)
target_label: __metrics_path__
replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
action: replace
- source_labels: [__meta_kubernetes_node_name]
separator: ;
regex: (.*)
target_label: node_name
replacement: $1
action: replace
- source_labels: [__meta_kubernetes_node_address_InternalIP]
separator: ;
regex: (.*)
target_label: node_ip
replacement: $1
action: replace
metric_relabel_configs:
- source_labels: [container, __name__]
separator: ;
regex: POD;container_(network).*
target_label: container
replacement: $1
action: replace
- source_labels: [container]
separator: ;
regex: POD
replacement: $1
action: drop
- source_labels: [container]
separator: ;
regex: ^$
replacement: $1
action: drop
- source_labels: [__name__]
separator: ;
regex: container_(network_tcp_usage_total|tasks_state|memory_failures_total|network_udp_usage_total)
replacement: $1
action: drop
kubernetes_sd_configs:
- role: node
kubeconfig_file: ""
follow_redirects: true
namespaces:
names:
- kafka
- monitoring
- job_name: kubernetes-nodes-kubelet
honor_timestamps: true
scrape_interval: 10s
scrape_timeout: 10s
metrics_path: /metrics
scheme: https
authorization:
type: Bearer
credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
insecure_skip_verify: true
follow_redirects: true
relabel_configs:
- separator: ;
regex: __meta_kubernetes_node_label_(.+)
replacement: $1
action: labelmap
- separator: ;
regex: (.*)
target_label: __address__
replacement: kubernetes.default.svc:443
action: replace
- source_labels: [__meta_kubernetes_node_name]
separator: ;
regex: (.+)
target_label: __metrics_path__
replacement: /api/v1/nodes/${1}/proxy/metrics
action: replace
kubernetes_sd_configs:
- role: node
kubeconfig_file: ""
follow_redirects: true
namespaces:
names:
- kafka
- monitoring
IIUC you have a kafka namespace with the running Kafka cluster and a monitoring namespace for the monitoring stack. Could you double-check that you have the namespaces configured correctly? What about other metrics (not related to Kafka Exporter), can you get them?
Yes. I can get the metrics from the Strimzi Operator and if I create a service and another job for prometheus I can get the stats for the Kafka Exporter as well.
apiVersion: v1
kind: Service
metadata:
  name: kafka-exporter-service
spec:
  selector:
    app.kubernetes.io/name: "kafka-exporter"
  ports:
    - protocol: TCP
      port: 9404
      targetPort: 9404
Added this job in prometheus-additional.yaml
- job_name: kafka-exporter
  scrape_interval: 10s
  scrape_timeout: 10s
  static_configs:
    - targets: ["kafka-exporter-service.kafka:9404"]
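One detail in the configuration posted earlier may be worth checking when operator metrics arrive but Kafka Exporter metrics do not. Prometheus joins the source_labels values with the separator and keeps a target only if the regex matches the whole joined string. The sketch below replays the two keep rules from the posted config with Python's re.fullmatch, which behaves the same as a fully anchored regex for these simple patterns. The label values fed in are assumptions (in particular, that the Kafka and Kafka Exporter pods carry strimzi.io/kind=Kafka), and the parenthesised variant is hypothetical, shown only for contrast.

```python
import re


def keep(regex, source_values, separator=";"):
    """Sketch of a 'keep' relabel rule: join the values, require a full match."""
    return re.fullmatch(regex, separator.join(source_values)) is not None


# Regexes copied from the posted config.
OPERATOR_RULE = "cluster-operator;true"
KAFKA_RULE = "Kafka|KafkaConnect|KafkaMirrorMaker|KafkaMirrorMaker2;true"

# Hypothetical variant with the alternation grouped, for comparison only.
GROUPED_RULE = "(Kafka|KafkaConnect|KafkaMirrorMaker|KafkaMirrorMaker2);true"

# Assumed label values: [strimzi.io/kind value, "label present" flag].
operator_kept = keep(OPERATOR_RULE, ["cluster-operator", "true"])
kafka_kept = keep(KAFKA_RULE, ["Kafka", "true"])
kafka_kept_grouped = keep(GROUPED_RULE, ["Kafka", "true"])

print(operator_kept)       # True
print(kafka_kept)          # False: ";true" binds only to the last alternative
print(kafka_kept_grouped)  # True
```

If this reading is right, the ungrouped alternation would drop every pod labelled Kafka, which would match the symptom of the Strimzi Operators dashboard working while the Kafka Exporter dashboard stays empty. This is an inference from the posted config, not something confirmed in the thread.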
I re-created your topology and I was able to get the Kafka Exporter metrics without any issue. This is the configuration generated by Prometheus:
global:
scrape_interval: 30s
scrape_timeout: 10s
evaluation_interval: 30s
external_labels:
prometheus: metrics/prometheus
prometheus_replica: prometheus-prometheus-0
alerting:
alert_relabel_configs:
- separator: ;
regex: prometheus_replica
replacement: $1
action: labeldrop
alertmanagers:
- follow_redirects: true
scheme: http
path_prefix: /
timeout: 10s
api_version: v2
relabel_configs:
- source_labels: [__meta_kubernetes_service_name]
separator: ;
regex: alertmanager
replacement: $1
action: keep
- source_labels: [__meta_kubernetes_endpoint_port_name]
separator: ;
regex: alertmanager
replacement: $1
action: keep
kubernetes_sd_configs:
- role: endpoints
kubeconfig_file: ""
follow_redirects: true
namespaces:
names:
- metrics
rule_files:
- /etc/prometheus/rules/prometheus-prometheus-rulefiles-0/*.yaml
scrape_configs:
- job_name: podMonitor/metrics/bridge-metrics/0
honor_timestamps: true
scrape_interval: 30s
scrape_timeout: 10s
metrics_path: /metrics
scheme: http
follow_redirects: true
relabel_configs:
- source_labels: [job]
separator: ;
regex: (.*)
target_label: __tmp_prometheus_job_name
replacement: $1
action: replace
- source_labels: [__meta_kubernetes_pod_label_strimzi_io_kind]
separator: ;
regex: KafkaBridge
replacement: $1
action: keep
- source_labels: [__meta_kubernetes_pod_container_port_name]
separator: ;
regex: rest-api
replacement: $1
action: keep
- source_labels: [__meta_kubernetes_namespace]
separator: ;
regex: (.*)
target_label: namespace
replacement: $1
action: replace
- source_labels: [__meta_kubernetes_pod_container_name]
separator: ;
regex: (.*)
target_label: container
replacement: $1
action: replace
- source_labels: [__meta_kubernetes_pod_name]
separator: ;
regex: (.*)
target_label: pod
replacement: $1
action: replace
- separator: ;
regex: (.*)
target_label: job
replacement: metrics/bridge-metrics
action: replace
- separator: ;
regex: (.*)
target_label: endpoint
replacement: rest-api
action: replace
- source_labels: [__address__]
separator: ;
regex: (.*)
modulus: 1
target_label: __tmp_hash
replacement: $1
action: hashmod
- source_labels: [__tmp_hash]
separator: ;
regex: "0"
replacement: $1
action: keep
kubernetes_sd_configs:
- role: pod
kubeconfig_file: ""
follow_redirects: true
namespaces:
names:
- kafka
- job_name: podMonitor/metrics/cluster-operator-metrics/0
honor_timestamps: true
scrape_interval: 30s
scrape_timeout: 10s
metrics_path: /metrics
scheme: http
follow_redirects: true
relabel_configs:
- source_labels: [job]
separator: ;
regex: (.*)
target_label: __tmp_prometheus_job_name
replacement: $1
action: replace
- source_labels: [__meta_kubernetes_pod_label_strimzi_io_kind]
separator: ;
regex: cluster-operator
replacement: $1
action: keep
- source_labels: [__meta_kubernetes_pod_container_port_name]
separator: ;
regex: http
replacement: $1
action: keep
- source_labels: [__meta_kubernetes_namespace]
separator: ;
regex: (.*)
target_label: namespace
replacement: $1
action: replace
- source_labels: [__meta_kubernetes_pod_container_name]
separator: ;
regex: (.*)
target_label: container
replacement: $1
action: replace
- source_labels: [__meta_kubernetes_pod_name]
separator: ;
regex: (.*)
target_label: pod
replacement: $1
action: replace
- separator: ;
regex: (.*)
target_label: job
replacement: metrics/cluster-operator-metrics
action: replace
- separator: ;
regex: (.*)
target_label: endpoint
replacement: http
action: replace
- source_labels: [__address__]
separator: ;
regex: (.*)
modulus: 1
target_label: __tmp_hash
replacement: $1
action: hashmod
- source_labels: [__tmp_hash]
separator: ;
regex: "0"
replacement: $1
action: keep
kubernetes_sd_configs:
- role: pod
kubeconfig_file: ""
follow_redirects: true
namespaces:
names:
- kafka
- job_name: podMonitor/metrics/entity-operator-metrics/0
honor_timestamps: true
scrape_interval: 30s
scrape_timeout: 10s
metrics_path: /metrics
scheme: http
follow_redirects: true
relabel_configs:
- source_labels: [job]
separator: ;
regex: (.*)
target_label: __tmp_prometheus_job_name
replacement: $1
action: replace
- source_labels: [__meta_kubernetes_pod_label_app_kubernetes_io_name]
separator: ;
regex: entity-operator
replacement: $1
action: keep
- source_labels: [__meta_kubernetes_pod_container_port_name]
separator: ;
regex: healthcheck
replacement: $1
action: keep
- source_labels: [__meta_kubernetes_namespace]
separator: ;
regex: (.*)
target_label: namespace
replacement: $1
action: replace
- source_labels: [__meta_kubernetes_pod_container_name]
separator: ;
regex: (.*)
target_label: container
replacement: $1
action: replace
- source_labels: [__meta_kubernetes_pod_name]
separator: ;
regex: (.*)
target_label: pod
replacement: $1
action: replace
- separator: ;
regex: (.*)
target_label: job
replacement: metrics/entity-operator-metrics
action: replace
- separator: ;
regex: (.*)
target_label: endpoint
replacement: healthcheck
action: replace
- source_labels: [__address__]
separator: ;
regex: (.*)
modulus: 1
target_label: __tmp_hash
replacement: $1
action: hashmod
- source_labels: [__tmp_hash]
separator: ;
regex: "0"
replacement: $1
action: keep
kubernetes_sd_configs:
- role: pod
kubeconfig_file: ""
follow_redirects: true
namespaces:
names:
- kafka
- job_name: podMonitor/metrics/kafka-resources-metrics/0
honor_timestamps: true
scrape_interval: 30s
scrape_timeout: 10s
metrics_path: /metrics
scheme: http
follow_redirects: true
relabel_configs:
- source_labels: [job]
separator: ;
regex: (.*)
target_label: __tmp_prometheus_job_name
replacement: $1
action: replace
- source_labels: [__meta_kubernetes_pod_label_strimzi_io_kind]
separator: ;
regex: Kafka|KafkaConnect|KafkaMirrorMaker|KafkaMirrorMaker2
replacement: $1
action: keep
- source_labels: [__meta_kubernetes_pod_container_port_name]
separator: ;
regex: tcp-prometheus
replacement: $1
action: keep
- source_labels: [__meta_kubernetes_namespace]
separator: ;
regex: (.*)
target_label: namespace
replacement: $1
action: replace
- source_labels: [__meta_kubernetes_pod_container_name]
separator: ;
regex: (.*)
target_label: container
replacement: $1
action: replace
- source_labels: [__meta_kubernetes_pod_name]
separator: ;
regex: (.*)
target_label: pod
replacement: $1
action: replace
- separator: ;
regex: (.*)
target_label: job
replacement: metrics/kafka-resources-metrics
action: replace
- separator: ;
regex: (.*)
target_label: endpoint
replacement: tcp-prometheus
action: replace
- separator: ;
regex: __meta_kubernetes_pod_label_(strimzi_io_.+)
replacement: $1
action: labelmap
- source_labels: [__meta_kubernetes_namespace]
separator: ;
regex: (.*)
target_label: namespace
replacement: $1
action: replace
- source_labels: [__meta_kubernetes_pod_name]
separator: ;
regex: (.*)
target_label: kubernetes_pod_name
replacement: $1
action: replace
- source_labels: [__meta_kubernetes_pod_node_name]
separator: ;
regex: (.*)
target_label: node_name
replacement: $1
action: replace
- source_labels: [__meta_kubernetes_pod_host_ip]
separator: ;
regex: (.*)
target_label: node_ip
replacement: $1
action: replace
- source_labels: [__address__]
separator: ;
regex: (.*)
modulus: 1
target_label: __tmp_hash
replacement: $1
action: hashmod
- source_labels: [__tmp_hash]
separator: ;
regex: "0"
replacement: $1
action: keep
kubernetes_sd_configs:
- role: pod
kubeconfig_file: ""
follow_redirects: true
namespaces:
names:
- kafka
- job_name: kubernetes-cadvisor
honor_labels: true
honor_timestamps: true
scrape_interval: 10s
scrape_timeout: 10s
metrics_path: /metrics/cadvisor
scheme: https
authorization:
type: Bearer
credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
insecure_skip_verify: true
follow_redirects: true
relabel_configs:
- separator: ;
regex: __meta_kubernetes_node_label_(.+)
replacement: $1
action: labelmap
- separator: ;
regex: (.*)
target_label: __address__
replacement: kubernetes.default.svc:443
action: replace
- source_labels: [__meta_kubernetes_node_name]
separator: ;
regex: (.+)
target_label: __metrics_path__
replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
action: replace
- source_labels: [__meta_kubernetes_node_name]
separator: ;
regex: (.*)
target_label: node_name
replacement: $1
action: replace
- source_labels: [__meta_kubernetes_node_address_InternalIP]
separator: ;
regex: (.*)
target_label: node_ip
replacement: $1
action: replace
metric_relabel_configs:
- source_labels: [container, __name__]
separator: ;
regex: POD;container_(network).*
target_label: container
replacement: $1
action: replace
- source_labels: [container]
separator: ;
regex: POD
replacement: $1
action: drop
- source_labels: [container]
separator: ;
regex: ^$
replacement: $1
action: drop
- source_labels: [__name__]
separator: ;
regex: container_(network_tcp_usage_total|tasks_state|memory_failures_total|network_udp_usage_total)
replacement: $1
action: drop
kubernetes_sd_configs:
- role: node
kubeconfig_file: ""
follow_redirects: true
- job_name: kubernetes-nodes-kubelet
honor_timestamps: true
scrape_interval: 10s
scrape_timeout: 10s
metrics_path: /metrics
scheme: https
authorization:
type: Bearer
credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
insecure_skip_verify: true
follow_redirects: true
relabel_configs:
- separator: ;
regex: __meta_kubernetes_node_label_(.+)
replacement: $1
action: labelmap
- separator: ;
regex: (.*)
target_label: __address__
replacement: kubernetes.default.svc:443
action: replace
- source_labels: [__meta_kubernetes_node_name]
separator: ;
regex: (.+)
target_label: __metrics_path__
replacement: /api/v1/nodes/${1}/proxy/metrics
action: replace
kubernetes_sd_configs:
- role: node
kubeconfig_file: ""
follow_redirects: true
Note that there are some differences between your config and mine. Please compare them. I think your job_name: podMonitor/monitoring/kafka-resources-metrics/0 is missing the namespace selector. That could be the issue.
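The namespace comparison can be sketched in Python. The two dictionaries below were transcribed by hand from the kubernetes_sd_configs sections of the two configurations posted in this thread (an empty list means the job has no namespaces block); the helper names are hypothetical.

```python
# Namespaces listed per job in the configuration reported as not working.
REPORTED = {
    "podMonitor/monitoring/bridge-metrics/0": ["kafka"],
    "podMonitor/monitoring/cluster-operator-metrics/0": ["kafka"],
    "podMonitor/monitoring/entity-operator-metrics/0": ["kafka"],
    "podMonitor/monitoring/kafka-resources-metrics/0": [],
    "kubernetes-cadvisor": ["kafka", "monitoring"],
    "kubernetes-nodes-kubelet": ["kafka", "monitoring"],
}

# Namespaces listed per job in the configuration reported as working.
WORKING = {
    "podMonitor/metrics/bridge-metrics/0": ["kafka"],
    "podMonitor/metrics/cluster-operator-metrics/0": ["kafka"],
    "podMonitor/metrics/entity-operator-metrics/0": ["kafka"],
    "podMonitor/metrics/kafka-resources-metrics/0": ["kafka"],
    "kubernetes-cadvisor": [],
    "kubernetes-nodes-kubelet": [],
}


def short_name(job):
    """Drop the Prometheus namespace from podMonitor job names so they line up."""
    parts = job.split("/")
    if parts[0] == "podMonitor" and len(parts) == 4:
        return parts[2]
    return job


def namespace_differences(left, right):
    """Return {job: (left namespaces, right namespaces)} for jobs that differ."""
    left_by_name = {short_name(job): sorted(ns) for job, ns in left.items()}
    right_by_name = {short_name(job): sorted(ns) for job, ns in right.items()}
    names = sorted(set(left_by_name) | set(right_by_name))
    return {
        name: (left_by_name.get(name), right_by_name.get(name))
        for name in names
        if left_by_name.get(name) != right_by_name.get(name)
    }


differences = namespace_differences(REPORTED, WORKING)
for job, (reported, working) in differences.items():
    print(f"{job}: reported={reported} working={working}")
```

Among the PodMonitor jobs, only kafka-resources-metrics differs: it has no namespaces entry in the reported config but lists kafka in the working one.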
Hi @NewmanJ1987, were you able to resolve this issue? I have the same problem with the same symptoms.
Ok, so can you try kubectl apply -f prometheus-additional.yaml -n monitoring to see if it helps?
Running kubectl create secret generic additional-scrape-configs --from-file=prometheus-additional.yaml -n monitoring would create a secret inside a secret, and Prometheus would not understand it. I guess this might have changed since the blog post.
It worked. You saved my day.
Describe the bug
I am trying to get some Kafka metrics running locally with Prometheus and Grafana. I'm getting stuck at this point when trying to apply the prometheus-additional.yaml file. This is the bug
To Reproduce
Steps to reproduce the behavior:
1) I create some namespaces
k create ns kafka
k create ns monitoring
2) Create Kafka cluster. Modified the examples/kafka-metrics.yaml file slightly (will attach the file at the bottom)
k apply -f kafka-metrics.yaml -n kafka
3) Set up Prometheus Operator
curl -s https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/master/bundle.yaml > bundle.yaml
I modified bundle.yaml and changed the namespace to monitoring
kubectl apply -f bundle.yaml -n monitoring
4) Create a secret for the additional-scrape-configs
kubectl create secret generic additional-scrape-configs --from-file=prometheus-additional.yaml -n monitoring
5) Modified the file prometheus.yaml to update the namespace to monitoring and then applied it as well
Expected behavior
The secret additional-scrape-configs to be parsed correctly.
Environment (please complete the following information):
YAML files and logs
This is the only file that I modified; everything else is the generic file from the examples directory with only the namespace changed.
To easily collect all YAMLs and logs, you can use our report script which will automatically collect all files and prepare a ZIP archive which can be easily attached to this issue. The usage of this script is:
./report.sh [--namespace <string>] [--cluster <string>]
report-10-11-2021_12-23-09.zip