Closed developer1622 closed 2 weeks ago
Pinging code owners:
receiver/prometheus: @Aneurysm9 @dashpole
With my standard Prometheus deployment (YAML attached below) I can see the targets build, but the same scrape config in the OTel Prometheus receiver throws errors.
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  labels:
    app: prometheus
data:
  prometheus.yml: |
    global:
      scrape_interval: 2m
      evaluation_interval: 2m
    scrape_configs:
      - job_name: etcd
        scheme: https
        kubernetes_sd_configs:
          - role: pod
        relabel_configs:
          # Keep only etcd pods in the kube-system namespace
          - action: keep
            source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_pod_name]
            separator: /
            regex: "kube-system/etcd.+"
          # Replace the address to use the pod IP with port 2379
          - source_labels: [__meta_kubernetes_pod_ip]
            action: replace
            target_label: __address__
            regex: (.*)
            replacement: $1:2379
        tls_config:
          insecure_skip_verify: true
          ca_file: /etc/etcd/ca.crt
          cert_file: /etc/etcd/server.crt
          key_file: /etc/etcd/server.key
      - job_name: kube-controller-manager
        honor_labels: true
        kubernetes_sd_configs:
          - role: pod
            namespaces:
              names:
                - kube-system
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        scheme: https
        tls_config:
          insecure_skip_verify: true
        relabel_configs:
          # Keep pods with the specified labels
          - source_labels: [__meta_kubernetes_pod_label_component, __meta_kubernetes_pod_label_tier]
            action: keep
            regex: kube-controller-manager;control-plane
          # Replace the address to use the pod IP with port 10257
          - source_labels: [__meta_kubernetes_pod_ip]
            action: replace
            target_label: __address__
            regex: (.*)
            replacement: $1:10257
      - job_name: kube-scheduler
        honor_labels: true
        kubernetes_sd_configs:
          - role: pod
            namespaces:
              names:
                - kube-system
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        scheme: https
        tls_config:
          insecure_skip_verify: true
        relabel_configs:
          # Keep pods with the specified labels
          - source_labels: [__meta_kubernetes_pod_label_component, __meta_kubernetes_pod_label_tier]
            action: keep
            regex: kube-scheduler;control-plane
          # Replace the address to use the pod IP with port 10259
          - source_labels: [__meta_kubernetes_pod_ip]
            action: replace
            target_label: __address__
            regex: (.*)
            replacement: $1:10259
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      containers:
        - name: prometheus-cont
          image: prom/prometheus
          volumeMounts:
            - name: config-volume
              mountPath: /etc/prometheus/prometheus.yml
              subPath: prometheus.yml
            - mountPath: /etc/etcd
              name: etcd-certs
          ports:
            - containerPort: 9090
      volumes:
        - name: config-volume
          configMap:
            name: prometheus-config
        - configMap:
            name: etcd-certs
          name: etcd-certs
      hostNetwork: true
      serviceAccount: otelcontribcol
      serviceAccountName: otelcontribcol
---
kind: Service
apiVersion: v1
metadata:
  name: prometheus-service
spec:
  selector:
    app: prometheus
  ports:
    - name: promui
      nodePort: 30900
      protocol: TCP
      port: 9090
      targetPort: 9090
  type: NodePort
```
Thank you.
I skimmed the issue, so apologies if I missed this. The OTel collector interprets `$1` as an environment variable reference. You need to escape it as `$$1`.
Let me know if that was your issue, or if I misread.
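For example, the affected relabel rule in the etcd job above would become (a sketch of just that rule; the rest of the config stays the same):

```yaml
relabel_configs:
  - source_labels: [__meta_kubernetes_pod_ip]
    action: replace
    target_label: __address__
    regex: (.*)
    # The collector turns $$ into a literal $, so Prometheus receives
    # the replacement string $1:2379 and expands the capture group itself.
    replacement: $$1:2379
```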
Hi @dashpole,
Thank you very much for the response; it worked after using two dollar signs (`$$`). You really saved me.
So, whatever works in standard Prometheus may need tweaking to run in the OTel Prometheus receiver?
Is this deviation from the standard Prometheus scrape config something architecturally specific that end users need to know about?
Thank you.
It exists because the Prometheus server config doesn't support environment variables, but the OTel collector config does.
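To illustrate the difference: the collector's config loader performs environment variable substitution before the Prometheus receiver ever sees the config, which is why a `$` that Prometheus should see must be doubled. A hypothetical sketch (ETCD_PORT is an illustrative variable, not from this issue; the `${env:...}` form is the current collector syntax, older versions also accepted bare `$VAR`):

```yaml
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: etcd
          relabel_configs:
            - source_labels: [__meta_kubernetes_pod_ip]
              action: replace
              target_label: __address__
              regex: (.*)
              # ${env:ETCD_PORT} is expanded by the collector at load time;
              # $$1 becomes the literal $1 that Prometheus expands at scrape time.
              replacement: $$1:${env:ETCD_PORT}
```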
Hi @dashpole, I forgot to ask one more question, thank you.
I have the below kube-scheduler scrape config, which is working fine:
```yaml
- job_name: kube-scheduler
  honor_labels: true
  kubernetes_sd_configs:
    - role: pod
      namespaces:
        names:
          - kube-system
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  scheme: https
  tls_config:
    insecure_skip_verify: true
  relabel_configs:
    # Keep pods with the specified labels
    - source_labels:
        [
          __meta_kubernetes_pod_label_component,
          __meta_kubernetes_pod_label_tier,
        ]
      action: keep
      regex: kube-scheduler;control-plane
    # Replace the address to use the pod IP with port 10259
    - source_labels: [__meta_kubernetes_pod_ip]
      action: replace
      target_label: __address__
      regex: (.*)
      replacement: $$1:10259
```
So, with this configuration, I can see scheduler metrics from only one instance (I have 3 control plane nodes, which means I have 3 schedulers).
Is this expected behaviour in multi-control-plane (multi-master) K8s clusters? The scrapes of the other 2 control plane schedulers are failing, but one of them succeeds.
Here is a sample log from the 2 failing instances:

```
Failed to scrape Prometheus endpoint {"kind": "receiver", "name": "prometheus", "data_type": "metrics", "scrape_timestamp": 1722538620170, "target_labels": "{name=\"up\", instance=\"
Failed to scrape Prometheus endpoint {"kind": "receiver", "name": "prometheus", "data_type": "metrics", "scrape_timestamp": 1722538620179, "target_labels": "{name=\"up\", instance=\"
```
Thank you.
I would expect 3 metrics. Try raising the logging verbosity to DEBUG to see the detailed scrape failure reason.
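In the collector, that is done in the `service::telemetry` section of the config. A minimal sketch, assuming a recent collector version:

```yaml
service:
  telemetry:
    logs:
      level: debug  # default is info; debug includes per-target scrape failure details
```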
Component(s)
receiver/prometheus
What happened?
Description
Please bear with me for the descriptive error message; it is actually short.
I am trying to scrape Prometheus metrics for etcd, kube-scheduler, and kube-controller-manager. However, it results in errors. I have tried multiple relabel configurations to get the final URL address correct, but it is still not working.
I exec'd into the pod and used curl to scrape the respective pod IP targets; all worked, but with the scrape config it does not.
Steps to Reproduce
Put the following scrape config under the `prometheus` receiver section of the collector configuration.
Expected Result
We should see the metrics from wherever they are being exported.
Actual Result
I have tried multiple configurations, so I got multiple errors; I will post all of them here.
Second:
And third:
It seems the complete URL is not being built, as we can see in the error below for the `instance` of all 3 components.
Collector version
latest image of contrib. here: otel/opentelemetry-collector-contrib:latest
Environment information
Environment
OS: (e.g., "Ubuntu 20.04") Compiler (if manually compiled): (e.g., "go 22"). It is a multi-control-plane K8s cluster; I have 3 control plane nodes,
so I have 3 etcd instances, 3 kube-controller-managers, and 3 kube-schedulers.
OpenTelemetry Collector configuration
Log output
Additional context
It is a multi-control-plane K8s cluster.
Thank you. I have tried to build the target, but it seems it is not being built. If my scrape config is not correct, please give the correct scrape config for the 3 K8s components.
Here are my pods for all 3 components:
Thank you