open-telemetry / opentelemetry-collector-contrib

Contrib repository for the OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0

[processor/k8sattributes] attributes retrieved from related K8s objects not applied if original attributes contains empty value #36373

Open bacherfl opened 11 hours ago

bacherfl commented 11 hours ago

Component(s)

processor/k8sattributes

What happened?

Description

The k8sattributes processor currently checks whether the original resource attributes already contain a value for a given key before applying an attribute. For example, after retrieving the namespace information for a resource, it checks whether the resource already has a value for k8s.namespace.name before setting that attribute to the name of the related namespace: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/c06be6df25ba21050827752f6e88b5054015a1a4/processor/k8sattributesprocessor/processor.go#L167

This avoids overwriting attributes that were explicitly set earlier. However, if one of these attributes (e.g. k8s.namespace.name) is present but set to an empty value, the processor skips it as well, so the namespace is never filled in. The question I would like to raise is: in this case, should the processor apply the value retrieved from the related K8s object, or should the original, empty value be left unmodified?

In my opinion, empty values should be treated the same way as non-existing values, and the processor should set the value, but I'd be interested in other opinions as well.
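
To make the two options concrete, here is a minimal sketch in Go. It is not the processor's actual code: it uses a plain map[string]string as a stand-in for the resource attributes, and the helper names setIfAbsent and setIfAbsentOrEmpty are hypothetical, chosen only for this illustration.

```go
package main

import "fmt"

// setIfAbsent models the behavior described above (hypothetical helper):
// any existing key, even one holding an empty string, blocks the write.
func setIfAbsent(attrs map[string]string, key, value string) {
	if _, found := attrs[key]; !found {
		attrs[key] = value
	}
}

// setIfAbsentOrEmpty models the proposed behavior (hypothetical helper):
// an empty value is treated like a missing one and gets replaced,
// while a non-empty value is still left untouched.
func setIfAbsentOrEmpty(attrs map[string]string, key, value string) {
	if existing, found := attrs[key]; !found || existing == "" {
		attrs[key] = value
	}
}

func main() {
	const key = "k8s.namespace.name"

	// Incoming resource carries an explicitly empty namespace attribute.
	current := map[string]string{key: ""}
	setIfAbsent(current, key, "e2ek8senrichment")
	fmt.Printf("current:  %q\n", current[key]) // prints: current:  ""

	proposed := map[string]string{key: ""}
	setIfAbsentOrEmpty(proposed, key, "e2ek8senrichment")
	fmt.Printf("proposed: %q\n", proposed[key]) // prints: proposed: "e2ek8senrichment"
}
```

With the second variant, an attribute that was explicitly set to a non-empty value is still preserved; only the empty case changes.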

Steps to Reproduce

Deploy the collector in a K8s cluster with the following config:

config.yaml

extensions:
  health_check:
    endpoint: 0.0.0.0:13133
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: ${env:MY_POD_IP}:4317
processors:
  k8sattributes:
    extract:
      metadata:
      - k8s.pod.name
      - k8s.pod.uid
      - k8s.deployment.name
      - k8s.statefulset.name
      - k8s.daemonset.name
      - k8s.job.name
      - k8s.cronjob.name
      - k8s.namespace.name
      - k8s.node.name
      - k8s.cluster.uid
    pod_association:
    - sources:
      - from: resource_attribute
        name: k8s.pod.name
      - from: resource_attribute
        name: k8s.namespace.name
    - sources:
      - from: resource_attribute
        name: k8s.pod.ip
    - sources:
      - from: resource_attribute
        name: k8s.pod.uid
    - sources:
      - from: connection

exporters:
  debug:
    verbosity: detailed
service:
  extensions:
  - health_check
  pipelines:
    traces:
      receivers:
      - otlp
      processors:
      - k8sattributes
      - transform
      exporters:
      - debug

Then send traffic to the collector, e.g. by creating a deployment that runs the telemetrygen CLI and adds an empty k8s.namespace.name attribute to the generated traces:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: telemetrygen-deployment
  namespace: e2ek8senrichment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: telemetrygen-deployment
  template:
    metadata:
      annotations:
        workload: deployment
      labels:
        app: telemetrygen-deployment
    spec:
      containers:
      - command:
        - /telemetrygen
        - traces
        - --otlp-insecure
        - --otlp-endpoint=otelcol.default.svc.cluster.local:4317
        - --duration=36000s
        - --rate=1
        - --otlp-attributes=service.name="test-trace-deployment"
        - --otlp-attributes=k8s.namespace.name=""
        image: ghcr.io/open-telemetry/opentelemetry-collector-contrib/telemetrygen:latest
        imagePullPolicy: IfNotPresent
        name: telemetrygen
      restartPolicy: Always

Expected Result

The exported traces should contain the resource attribute k8s.namespace.name=e2ek8senrichment

Actual Result

The exported traces contain an empty value for the attribute k8s.namespace.name

Collector version

v0.113.0

Environment information

Environment

OS: kind cluster on macOS

OpenTelemetry Collector configuration

Same as the config.yaml shown under Steps to Reproduce.

Log output

No response

Additional context

I'm already working on a fix for this and will open a PR if the code owners agree with the proposed change.

github-actions[bot] commented 11 hours ago

Pinging code owners: