open-telemetry / oteps

OpenTelemetry Enhancement Proposals
https://opentelemetry.io
Apache License 2.0

Environment Variable Resource Detection in Kubernetes #195

Closed dashpole closed 1 year ago

dashpole commented 2 years ago

This OTEP proposes using a consistent set of environment variables to detect resource attributes available through the Kubernetes downward API. The Kubernetes downward API is designed to serve this function and is the most consistent way to detect Kubernetes resource attributes. This will improve user experience, and enable re-use of the same Kubernetes configuration across languages and when using the collector.

dashpole commented 2 years ago

cc @Aneurysm9 @pmm-sumo @dmitryax who might be interested

pmm-sumo commented 2 years ago

I think that the downward API can currently also expose labels and annotations. While it's not included in the semantic conventions, k8sattributesprocessor can currently extract those into resource attributes (and this information is frequently very useful).

I am wondering if there's room to include such a capability via env resource detection. Currently, the labels/annotations are filtered by k8sattributesprocessor, but perhaps we could include those in a separate namespace and let filtering happen (if needed) using e.g. resourceprocessor.

dashpole commented 2 years ago

If there were different recognized variables, we would probably switch to them, though. As far as I can tell, it's not practical to set that purely with YAML, so having dedicated env vars seems worth it.

Oh, wow. I just realized you can define dependent env vars, which might be a better solution than what I'm proposing.

I could add:

- name: OTEL_RESOURCE_ATTRIBUTES
  value: k8s.pod.name=$(K8S_POD_NAME),k8s.pod.uid=$(K8S_POD_UID),k8s.namespace.name=$(K8S_NAMESPACE_NAME),k8s.node.name=$(K8S_NODE_NAME)

to achieve the goals of this proposal, without requiring any detectors to be installed. I've added this as an alternative for now.
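For context, here is a fuller sketch of what that container `env` list might look like. The `K8S_*` names are the ones used in the snippet above; the `fieldPath` values are the standard downward API fields, but the overall layout is an illustrative assumption rather than text from the OTEP. Kubernetes only expands `$(VAR)` references to variables defined earlier in the same list, so the downward API entries have to come first.

```yaml
env:
  # Downward API values; these must precede any variable that references them.
  - name: K8S_POD_NAME
    valueFrom:
      fieldRef:
        fieldPath: metadata.name
  - name: K8S_POD_UID
    valueFrom:
      fieldRef:
        fieldPath: metadata.uid
  - name: K8S_NAMESPACE_NAME
    valueFrom:
      fieldRef:
        fieldPath: metadata.namespace
  - name: K8S_NODE_NAME
    valueFrom:
      fieldRef:
        fieldPath: spec.nodeName
  # Dependent variable: each $(...) reference is expanded by Kubernetes
  # when the container is created.
  - name: OTEL_RESOURCE_ATTRIBUTES
    value: k8s.pod.name=$(K8S_POD_NAME),k8s.pod.uid=$(K8S_POD_UID),k8s.namespace.name=$(K8S_NAMESPACE_NAME),k8s.node.name=$(K8S_NODE_NAME)
```

If a referenced variable is missing or defined later in the list, Kubernetes leaves the literal `$(VAR)` text in the value, so a typo would surface as an odd-looking resource attribute rather than an error.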

mat-rumian commented 2 years ago

@dashpole I would like to share my opinion. What do you think about configuring an additional env var holding the k8s-related resource attributes in a separate var, like:

- name: OTEL_RESOURCE_ATTRIBUTES_K8S
  value: k8s.pod.name=$(K8S_POD_NAME),k8s.pod.uid=$(K8S_POD_UID),k8s.namespace.name=$(K8S_NAMESPACE_NAME),k8s.node.name=$(K8S_NODE_NAME)

and use it this way:

- name: OTEL_RESOURCE_ATTRIBUTES
  value: $(OTEL_RESOURCE_ATTRIBUTES_K8S),other=attribs...

I think it will be simpler, more transparent, and easier to handle and modify.

dashpole commented 2 years ago

I think that the downward API can currently also expose labels and annotations. ... I am wondering if there's room to include such a capability via env resource detection.

If we went with the current proposal (explicitly defined env vars that are detected with a kubernetes detector), we could also support K8S_LABELS and K8S_ATTRIBUTES. The kubernetes detector could accept a configuration option to allow elevating parsed labels or attributes to resource attributes.

If we used dependent environment variables, as we are discussing above (no detector needed), users could just fetch labels and attributes themselves:

- name: MY_CUSTOM_ATTRIBUTE
  valueFrom:
    fieldRef:
      fieldPath: metadata.labels['my-custom-attribute']
- name: OTEL_RESOURCE_ATTRIBUTES
  value: other.semantic.convention=$(MY_CUSTOM_ATTRIBUTE)

There are a lot of labels/attributes that would come in if the resourcedetectionprocessor added them by default. Filtering later seems tedious (e.g. what if my cluster admin adds a new label)...

WDYT?

dashpole commented 2 years ago

What do you think about configuring an additional env var holding the k8s-related resource attributes in a separate var, like ...

That seems like a perfectly reasonable thing to do, practically speaking. In terms of this proposal, I think the core question is whether or not we want to specify a new resource detector. I'd prefer to keep the YAML simpler for readability, if that's acceptable.

bogdandrutu commented 2 years ago

@dashpole very nice OTEP, please consider also referring to the proposal to support OTEL_RESOURCE_ATTRIBUTES_*

dashpole commented 2 years ago

please consider also referring to the proposal to support OTEL_RESOURCE_ATTRIBUTES_*

Done.

seh commented 2 years ago

On the subject of the difficulty of knitting together several environment variables into a final "OTEL_RESOURCE_ATTRIBUTES" variable, please see open-telemetry/opentelemetry-specification#1982.

Here's how I handled this in a kustomization I wrote recently, with the following in the "base" manifest for a Deployment:

- name: _OTEL_RESOURCES_ATTRIBUTES_UNDERLAY
  value: ""
- name: _OTEL_RESOURCES_ATTRIBUTES_OVERLAY
  value: ""
- name: OTEL_RESOURCE_ATTRIBUTES
  value: >-
    $(_OTEL_RESOURCES_ATTRIBUTES_UNDERLAY)
    k8s.container.name=my-thing,
    k8s.deployment.name=something,
    k8s.namespace.name=$(POD_NAMESPACE),
    k8s.node.name=$(NODE_NAME),
    k8s.pod.name=$(POD_NAME),
    k8s.pod.primary_ip_address=$(POD_IP_ADDRESS),
    k8s.pod.service_account.name=$(POD_SERVICE_ACCOUNT_NAME),
    k8s.pod.uid=$(POD_UID),
    net.host.ip=$(NODE_IP_ADDRESS)
    $(_OTEL_RESOURCES_ATTRIBUTES_OVERLAY)

Note the two empty "_OTEL_RESOURCES_ATTRIBUTES_UNDERLAY" and "_OTEL_RESOURCES_ATTRIBUTES_OVERLAY" variables. In overlay kustomizations, I can populate either or both of those, but I have to be careful to terminate the former with a trailing comma and prefix the latter with a leading comma.
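As a rough illustration of that last point, an overlay kustomization might patch the Deployment along these lines. This is a sketch under assumptions: the Deployment name `something` and container name `my-thing` are borrowed from the attribute values in the base above, and `deployment.environment=staging` is just a made-up extra attribute.

```yaml
# Hypothetical strategic merge patch applied by an overlay kustomization.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: something
spec:
  template:
    spec:
      containers:
        - name: my-thing
          env:
            # Entries are matched to the base by name. The leading comma is
            # required because the last attribute in the base value has no
            # trailing comma.
            - name: _OTEL_RESOURCES_ATTRIBUTES_OVERLAY
              value: ",deployment.environment=staging"
```

This only works if the patched variable still appears before OTEL_RESOURCE_ATTRIBUTES in the merged `env` list, since `$(...)` references are expanded only for variables defined earlier; it is worth checking the rendered output to confirm the ordering.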

dashpole commented 2 years ago

Thanks @seh, I've added that to the drawbacks of using dependent environment variables.