open-telemetry / opentelemetry-collector-contrib

Contrib repository for the OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0

Not able to scrape metrics using prometheus receiver #32710

Open kadhamecha-conga opened 2 months ago

kadhamecha-conga commented 2 months ago

Component(s)

receiver/prometheus

What happened?

The OTel Collector is running in DaemonSet mode and is configured to scrape metrics from pods based on specific annotations, typically prometheus.io/scrape and prometheus.io/path, which identify the pods that expose metrics and the paths where the metrics are available.
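For illustration, the annotation convention assumed by the scrape config below looks roughly like this (the pod name and port are placeholders):

apiVersion: v1
kind: Pod
metadata:
  name: my-app                       # placeholder name
  annotations:
    prometheus.io/scrape: "true"     # kept by the _prometheus_io_scrape relabel rule
    prometheus.io/path: "/metrics"   # rewritten into __metrics_path__
    prometheus.io/port: "8080"       # combined with the pod IP into __address__
spec:
  containers:
  - name: app
    ports:
    - containerPort: 8080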

I have attached the config and logs for scraping metrics based on these annotations, but no scraping happens and no metrics are received.

Can you please help?

Thanks.

Collector version

0.73

Environment information

EKS 1.26

OpenTelemetry Collector configuration

receivers:
  prometheus:
    config:
      scrape_configs:
      - job_name: apps
        kubernetes_sd_configs:
        - role: pod
          selectors:
          - field: spec.nodeName=$KUBE_NODE_NAME
            role: pod
        relabel_configs:
        - action: keep
          regex: "true"
          source_labels:
          - __meta_kubernetes_pod_ready
        - action: keep
          regex: "true"
          source_labels:
          - __meta_kubernetes_pod_annotation_prometheus_io_scrape
        - action: replace
          regex: (.+)
          source_labels:
          - __meta_kubernetes_pod_annotation_prometheus_io_path
          target_label: __metrics_path__
        - action: replace
          regex: (.+)(?::\d+);(\d+)
          replacement: $$1:$$2
          source_labels:
          - __address__
          - __meta_kubernetes_pod_annotation_prometheus_io_port
          target_label: __address__
        scrape_interval: 30s

service:
  extensions: [health_check, zpages, memory_ballast]
  telemetry:
    logs:
      level: info
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, resource, batch]
      exporters: [otlp, logging]
    metrics:
      receivers: [prometheus, kubeletstats/infra, k8s_cluster]
      processors: [memory_limiter, k8sattributes, resource, batch]
      exporters: [otlphttp/metrics]
    logs:
      receivers: [filelog]
      processors: [memory_limiter, k8sattributes, resource/container, resource, batch]
      exporters: [otlp, logging]

Log output

info    prometheusreceiver@v0.73.0/metrics_receiver.go:243  Scrape job added    {"kind": "receiver", "name": "prometheus", "data_type": "metrics", "jobName": "opentelemetry-collector"}
info    prometheusreceiver@v0.73.0/metrics_receiver.go:243  Scrape job added    {"kind": "receiver", "name": "prometheus", "data_type": "metrics", "jobName": "apps"}
info    kubernetes/kubernetes.go:326    Using pod service account via in-cluster config {"kind": "receiver", "name": "prometheus", "data_type": "metrics", "discovery": "kubernetes", "config": "apps"}
info    prometheusreceiver@v0.73.0/metrics_receiver.go:255  Starting discovery manager  {"kind": "receiver", "name": "prometheus", "data_type": "metrics"}
info    prometheusreceiver@v0.73.0/metrics_receiver.go:289  Starting scrape manager {"kind": "receiver", "name": "prometheus", "data_type": "metrics"}

Additional context

No response

github-actions[bot] commented 2 months ago

Pinging code owners for receiver/prometheus: @Aneurysm9 @dashpole. See Adding Labels via Comments if you do not have permissions to add labels yourself.

dashpole commented 2 months ago

looks like the prometheus receiver defined above isn't used in your pipeline

kadhamecha-conga commented 2 months ago

Hi, it is there in the metrics pipeline: metrics: receivers: [prometheus, kubeletstats/infra, k8s_cluster]

dashpole commented 2 months ago

Can you check that you've set the KUBE_NODE_NAME env using the downward API in your pod spec?

And I assume you are setting the annotations correctly on your pods?
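For example, the downward API entry in the collector DaemonSet pod spec would look roughly like this (a minimal sketch, placed under the collector container):

env:
- name: KUBE_NODE_NAME
  valueFrom:
    fieldRef:
      fieldPath: spec.nodeName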

dashpole commented 2 months ago

And you've granted the collector the ability to get/list/watch pods using RBAC?
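For example, a ClusterRole along these lines, bound to the collector's service account via a ClusterRoleBinding (the name here is a placeholder):

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: otel-collector   # placeholder name
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]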

kadhamecha-conga commented 2 months ago

Yes, the annotations are present in the pod image.

The RBAC used by the OTel pod has the following rules: `rules:

kadhamecha-conga commented 2 months ago

hi @dashpole , can you please guide, if I'm missing something in config?

dashpole commented 2 months ago

The config looks correct. Are you looking at the logs from the collector which is on the same node as the application? The other thing you can try is removing some of the relabel_configs to see if any of them are the cause of the pod not being discovered.
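For example, a stripped-down debug-only job roughly like the sketch below (hypothetical, not a production config) would confirm whether pod discovery itself works, since every discovered pod on the node would be scraped at its default address and /metrics path:

receivers:
  prometheus:
    config:
      scrape_configs:
      - job_name: apps-debug        # hypothetical debug job
        scrape_interval: 30s
        kubernetes_sd_configs:
        - role: pod
          selectors:
          - role: pod
            field: spec.nodeName=$KUBE_NODE_NAME
        # relabel_configs intentionally omitted so nothing filters the targets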

github-actions[bot] commented 5 days ago

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.