open-telemetry / opentelemetry-collector-contrib

Contrib repository for the OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0

TargetAllocators - cannot discover services from two different serviceMonitor #35607

Closed flenoir closed 1 week ago

flenoir commented 1 week ago

Component(s)

receiver/prometheus

Describe the issue you're reporting

Hi,

Working with target allocators, I want to discover services from two distinct ServiceMonitors. When I read the doc,

I see: "ServiceMonitors to be selected for target discovery. This is a map of {key,value} pairs. Each {key,value} in the map is going to exactly match a label in a ServiceMonitor's meta labels. The requirements are ANDed".

My config is:

spec:
  serviceAccount: my_sa
  targetAllocator:
    enabled: true
    serviceAccount: my_sa
    # allocationStrategy: "consistent-hashing"
    replicas: 2
    prometheusCR:
      enabled: true
      serviceMonitorSelector:
        release: first
        my-other-key: alpha

But in this case, it only matches ServiceMonitors that carry both labels. I'd prefer to match two different ServiceMonitors: one with "release: first" and one with "my-other-key: alpha".

How can I get OR instead of AND?
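The closest thing I can find in a standard Kubernetes label selector is matchExpressions with the In operator, but as far as I can tell that only ORs the values of a single key, not two different keys. A sketch of what I mean (the label key monitoring-group here is hypothetical, not something in my cluster):

```yaml
# Sketch only: In ORs the values of ONE key; I see no way to OR
# across two different label keys within a single selector.
serviceMonitorSelector:
  matchExpressions:
  - key: monitoring-group   # hypothetical shared label key
    operator: In
    values:
    - first
    - alpha
```

So the only workaround I see is to add a common label to both ServiceMonitors and select on that, but that means modifying the ServiceMonitors themselves, which I'd prefer to avoid.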

github-actions[bot] commented 1 week ago

Pinging code owners:

dashpole commented 1 week ago

Can you share the collector configuration that is generated? This may be a feature request for the opentelemetry-operator, rather than for the prometheus receiver.

flenoir commented 1 week ago

Hi @dashpole, yes, please find the config below:

apiVersion: opentelemetry.io/v1beta1
kind: OpenTelemetryCollector
metadata:
  creationTimestamp: "2024-09-30T15:16:23Z"
  finalizers:
  - opentelemetrycollector.opentelemetry.io/finalizer
  generation: 64
  labels:
    app.kubernetes.io/managed-by: opentelemetry-operator
  name: otel-dp-collector-k8s
  namespace: my_namespace
  resourceVersion: "1605240555"
  uid: 04ca6338-27b0-4b6f-9cca-aa87a817e799
spec:
  config:
    exporters:
      logging: {}
      otlp:
        endpoint: http://ofusced.z1-dev.k8s.my-company.intra:80
        tls:
          insecure: true
      prometheusremotewrite/vm:
        compression: snappy
        endpoint: http://victoriametrics-obs.z6-prod-orl.k8s.my-company.intra/insert/0/prometheus/api/v1/write
        remote_write_queue:
          enabled: true
          num_consumers: 10
          queue_size: 50000
        resource_to_telemetry_conversion:
          enabled: true
        timeout: 15s
        tls:
          insecure: true
          insecure_skip_verify: true
    processors:
      batch:
        send_batch_max_size: 10000
        send_batch_size: 4000
        timeout: 60s
      k8sattributes:
        extract:
          metadata:
          - k8s.namespace.name
          - k8s.deployment.name
          - k8s.statefulset.name
          - k8s.daemonset.name
          - k8s.cronjob.name
          - k8s.job.name
          - k8s.node.name
          - k8s.pod.name
          - k8s.pod.uid
          - k8s.pod.start_time
        passthrough: false
        pod_association:
        - sources:
          - from: resource_attribute
            name: k8s.pod.ip
        - sources:
          - from: resource_attribute
            name: k8s.pod.uid
        - sources:
          - from: connection
      memory_limiter:
        check_interval: 1s
        limit_mib: 4000
        spike_limit_mib: 800
      resource:
        attributes:
        - action: insert
          key: k8s.cluster.name
          value: y1-dev
        - action: insert
          key: k8s.perimetre.name
          value: fab
    receivers:
      k8s_cluster:
        allocatable_types_to_report:
        - cpu
        - memory
        - storage
        auth_type: serviceAccount
        collection_interval: 60s
        metrics:
          k8s.node.condition:
            enabled: true
          k8s.pod.status_reason:
            enabled: true
        node_conditions_to_report:
        - Ready
        - MemoryPressure
        - DiskPressure
        - NetworkUnavailable
      k8s_events:
        auth_type: serviceAccount
      otlp:
        protocols:
          grpc:
            endpoint: 0.0.0.0:4317
          http:
            endpoint: 0.0.0.0:4318
      prometheus:
        config:
          scrape_configs:
          - job_name: otel-collector
            scrape_interval: 30s
            static_configs:
            - targets:
              - 0.0.0.0:8888
        target_allocator:
          collector_id: null
          endpoint: http://otel-dp-collector-k8s-targetallocator:80
          http_sd_config:
            refresh_interval: 60s
          interval: 30s
    service:
      pipelines:
        metrics:
          exporters:
          - logging
          - prometheusremotewrite/vm
          processors:
          - memory_limiter
          - k8sattributes
          - resource
          - batch
          receivers:
          - k8s_cluster
          - prometheus
  configVersions: 3
  daemonSetUpdateStrategy: {}
  deploymentUpdateStrategy: {}
  env:
  - name: K8S_NODE_NAME
    valueFrom:
      fieldRef:
        fieldPath: spec.nodeName
  - name: POD_NAME
    valueFrom:
      fieldRef:
        fieldPath: metadata.name
  image: docker-ghcr-proxy.repository.my-company.intra/open-telemetry/opentelemetry-collector-releases/opentelemetry-collector-contrib:0.109.0
  ingress:
    route: {}
  managementState: managed
  mode: statefulset
  observability:
    metrics: {}
  podDisruptionBudget:
    maxUnavailable: 1
  replicas: 2
  resources:
    limits:
      cpu: "2"
      memory: 10Gi
    requests:
      cpu: 200m
      memory: 512Mi
  serviceAccount: otel-dp-collector-k8s
  targetAllocator:
    allocationStrategy: consistent-hashing
    enabled: true
    filterStrategy: relabel-config
    observability:
      metrics: {}
    podDisruptionBudget:
      maxUnavailable: 1
    prometheusCR:
      enabled: true
      podMonitorSelector:
        matchLabels:
          value: nothing
      scrapeInterval: 30s
      serviceMonitorSelector:
        matchLabels:
          release: my-value
    replicas: 2
    resources:
      limits:
        cpu: 200m
        memory: 1Gi
      requests:
        cpu: 100m
        memory: 256Mi
    serviceAccount: otel-dp-collector-k8s-targetallocator
  upgradeStrategy: automatic
status:
  image: docker-ghcr-proxy.repository.my-company.intra/open-telemetry/opentelemetry-collector-releases/opentelemetry-collector-contrib:0.109.0
  scale:
    replicas: 2
    selector: app.kubernetes.io/component=opentelemetry-collector,app.kubernetes.io/instance=my-namespace.otel-dp-collector-k8s,app.kubernetes.io/managed-by=opentelemetry-operator,app.kubernetes.io/name=otel-dp-collector-k8s-collector,app.kubernetes.io/part-of=opentelemetry,app.kubernetes.io/version=0.109.0,tmj.fr/code-composant=oi314,tmj.fr/sous-composant=my-namespace-workloads
    statusReplicas: 2/2
  version: 0.102.1
dashpole commented 1 week ago

Yeah, I don't think there is anything we can do in the prometheus receiver. Please open an issue here: https://github.com/open-telemetry/opentelemetry-operator