open-telemetry / opentelemetry-collector-contrib

Contrib repository for the OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0

Opentelemetry does not discover target with service monitor #17145

Closed vivekcusat closed 1 year ago

vivekcusat commented 1 year ago

What are you trying to achieve? I have configured the Prometheus receiver to scrape metrics. Service discovery starts collecting the metrics and sending them to my exporter, but all of the metrics arrive with a no-namespace tag, so it is not possible to identify which app they come from. I defined a ServiceMonitor to label them, but it does not work.

What did you expect to see? Metrics should arrive with the label public-web-api-dev.

Additional context. Please find my collector config and ServiceMonitor configuration below.

apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: my-collector
spec:
  mode: deployment # This configuration is omittable.
  config: |
    exporters:
      datadog:
        api:
          key: ""
          site: "datadoghq.eu"

    extensions:
      health_check: {}
      memory_ballast: {}
      zpages: {}
    processors:
      batch: 
        send_batch_max_size: 1000
        send_batch_size: 100
        timeout: 10s

    receivers:
      otlp:
        protocols:
          http:
            endpoint: 0.0.0.0:4318
          grpc:
            endpoint: 0.0.0.0:4317

      prometheus:
        config:
          scrape_configs:
            - job_name: public-web-api-dev
              scrape_interval: 60s
              kubernetes_sd_configs:
                - role: endpoints
              relabel_configs:
                - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
                  action: keep
                  regex: default;public-web-api-dev;web

      hostmetrics:
        collection_interval: 10s
        scrapers:
          paging:
            metrics:
              system.paging.utilization:
                enabled: true
          cpu:
            metrics:
              system.cpu.utilization:
                enabled: true
          disk:
          filesystem:
            metrics:
              system.filesystem.utilization:
                enabled: true
          load:
          memory:
          network:
          processes:

    service:
      extensions:
        - health_check
        - memory_ballast
        - zpages
      pipelines:
        metrics: 
          receivers: [prometheus, hostmetrics]
          processors: [batch]
          exporters: [datadog]
        traces:
          receivers: 
            - otlp
          processors: 
            - batch
          exporters: 
            - datadog
      telemetry:
        logs:
          level: "debug"
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app: public-web-api-dev
    release: opentelemetry-operator
  name: public-web-api-dev
  namespace: default
spec:
  endpoints:
  - path: /metrics
    port: web
  selector:
    matchLabels:
      app: public-web-api-dev
      release: opentelemetry-operator


github-actions[bot] commented 1 year ago

Pinging code owners for receiver/prometheus: @Aneurysm9 @dashpole. See Adding Labels via Comments if you do not have permissions to add labels yourself.

dashpole commented 1 year ago

ServiceMonitor is an object watched by the Prometheus Operator and OpenTelemetry operator. It looks like you might be using the OpenTelemetry operator. Can you share your OpenTelemetry collector object?

vivekcusat commented 1 year ago

My collector object

apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: my-collector
spec:
  mode: statefulset # This configuration is omittable.
  targetAllocator:
    enabled: true
    prometheusCR:
      enabled: true
      serviceMonitorSelector:
        release: prometheus-oprator
    replicas: 1

  config: |
    exporters:
      datadog:
        host_metadata:
          tags: [dev:abc]
        api:
          key: "API_KEY"
          site: "datadoghq.eu"

    extensions:
      health_check: {}
      memory_ballast: {}
      zpages: {}
    processors:
      batch: 
        send_batch_max_size: 1000
        send_batch_size: 100
        timeout: 10s
      resourcedetection:
        detectors: [env, gcp, ecs, ec2, azure, system]
        timeout: 5s
        override: false
      k8sattributes:

    receivers:
      otlp:
        protocols:
          http:
            endpoint: 0.0.0.0:4318
          grpc:
            endpoint: 0.0.0.0:4317

      fluentforward:
        endpoint: 0.0.0.0:8006

      prometheus:
        config:
          scrape_configs:
            - job_name: 'kubernetes-pods'
              scrape_interval: 10s
              honor_labels: true
              kubernetes_sd_configs:
                - role: endpoints
              relabel_configs:
                - source_labels: [__meta_kubernetes_endpoint_port_name]
                  regex: web
                  action: keep
                - source_labels: [__meta_kubernetes_service_name]
                  action: replace
                  target_label: service   

      hostmetrics:
        collection_interval: 10s
        scrapers:
          paging:
            metrics:
              system.paging.utilization:
                enabled: true
          cpu:
            metrics:
              system.cpu.utilization:
                enabled: true
          disk:
          filesystem:
            metrics:
              system.filesystem.utilization:
                enabled: true
          load:
          memory:
          network:
          processes:

    service:
      extensions:
        - health_check
        - memory_ballast
        - zpages
      pipelines:
        metrics: 
          receivers: [prometheus]
          processors: [batch]
          exporters: [datadog]
        traces:
          receivers: 
            - otlp
          processors: 
            - batch
          exporters: 
            - datadog
        logs:
          receivers: [fluentforward]
          processors: [batch]
          exporters: [datadog]
      telemetry:
        logs:
          level: "debug"

Aneurysm9 commented 1 year ago

I see that you have enabled the target allocator in your collector CR, which is the first step to making this work. Can you try to add the additional configuration found here to tell the collector how to discover new jobs created by the target allocator discovering ServiceMonitor resources?
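
The link target of "here" did not survive in this thread, so the following is only a sketch of the kind of configuration the comment appears to describe. It assumes the Prometheus receiver's `target_allocator` block. The service name `my-collector-targetallocator` and the `POD_NAME` environment variable are assumptions based on the collector name above; check both against your cluster.

```yaml
# Fragment of the collector config (the string under spec.config).
# A hedged sketch, not a verified fix for this issue.
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: 'kubernetes-pods'   # existing static job from the config above
          scrape_interval: 10s
    target_allocator:
      # Assumed service name: "<collector name>-targetallocator".
      endpoint: http://my-collector-targetallocator
      # How often the receiver polls the target allocator for new jobs and targets.
      interval: 30s
      # Assumed: POD_NAME is set in the collector pod's environment.
      collector_id: "${POD_NAME}"
```

With a block like this, jobs that the target allocator derives from ServiceMonitor resources would be picked up in addition to the statically configured `scrape_configs`.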

jcdauchy-moodys commented 1 year ago

@vivekcusat, did you notice the typo in the ServiceMonitor selector of your target allocator definition (`oprator` instead of `operator`)?

      serviceMonitorSelector:
        release: prometheus-oprator
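
For comparison, the ServiceMonitor posted earlier in this thread carries the label `release: opentelemetry-operator`. So fixing only the spelling (`prometheus-operator`) would still not match it. A minimal sketch of a selector that would match, assuming the intent is to select that ServiceMonitor and that `serviceMonitorSelector` takes a plain label map as in the original CR:

```yaml
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: my-collector
spec:
  mode: statefulset
  targetAllocator:
    enabled: true
    prometheusCR:
      enabled: true
      serviceMonitorSelector:
        # Assumed value: copied from the `release` label on the
        # public-web-api-dev ServiceMonitor shown above.
        release: opentelemetry-operator
    replicas: 1
```

Whether this alone resolves the missing labels was not confirmed in the thread.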

github-actions[bot] commented 1 year ago

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

github-actions[bot] commented 1 year ago

This issue has been closed as inactive because it has been stale for 120 days with no activity.