open-telemetry / opentelemetry-go-contrib

Collection of extensions for OpenTelemetry-Go.
https://opentelemetry.io/
Apache License 2.0

Request to Add Detector for Kubernetes #4136

Open dashpole opened 1 year ago

dashpole commented 1 year ago

Background

At one point, I had written an OTep about how kubernetes resource detectors should function, but it wasn't accepted due to lack of interest. https://github.com/open-telemetry/oteps/pull/195

I was recently reminded of the impact of this in https://github.com/open-telemetry/opentelemetry-go-contrib/pull/2310#issuecomment-1665703221.

Proposed Solution

Add a Kubernetes resource detector in detectors/kubernetes/. It will have a single implementation of the resource.Detector interface, and no configuration.

It will detect:

From:

If pod name is not discovered, fall back to the HOSTNAME env var. This defaults to the pod's name, but can be overridden in the pod spec.

If pod namespace is not discovered, fall back to reading /var/run/secrets/kubernetes.io/serviceaccount/namespace. This is always the pod's namespace AFAIK, but isn't a recommended way to get it.
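
To make the proposal concrete, here is a minimal, hypothetical sketch of what such a detector could look like in Go. The package layout, semconv version, and helper names are illustrative assumptions, not the final API:

package kubernetes // hypothetical detectors/kubernetes package

import (
    "context"
    "os"
    "strings"

    "go.opentelemetry.io/otel/attribute"
    "go.opentelemetry.io/otel/sdk/resource"
    semconv "go.opentelemetry.io/otel/semconv/v1.21.0"
)

const namespaceFile = "/var/run/secrets/kubernetes.io/serviceaccount/namespace"

type detector struct{}

// Detect implements resource.Detector using the fallbacks described above.
func (detector) Detect(_ context.Context) (*resource.Resource, error) {
    var attrs []attribute.KeyValue

    // Pod name: fall back to HOSTNAME, which defaults to the pod's name.
    if host := os.Getenv("HOSTNAME"); host != "" {
        attrs = append(attrs, semconv.K8SPodNameKey.String(host))
    }

    // Namespace: fall back to the service account namespace file.
    if b, err := os.ReadFile(namespaceFile); err == nil {
        attrs = append(attrs, semconv.K8SNamespaceNameKey.String(strings.TrimSpace(string(b))))
    }

    return resource.NewWithAttributes(semconv.SchemaURL, attrs...), nil
}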

Prior Art

OTel Operator: https://github.com/open-telemetry/opentelemetry-operator/blob/76cff59b0c0640da29c56d5ae91eae5fe843ae5b/pkg/constants/env.go#L25

Tasks

@jaredjenkins

shivanshuraj1333 commented 1 year ago

This is interesting. @dashpole, may I pick up the work?

I recently added k8s.cluster.uid to the k8sattributes processor in the OTel Collector.

Would love to work on this feature as well, kindly assign :)

dashpole commented 1 year ago

Let's make sure the maintainers are on board first.

@open-telemetry/go-approvers please let me know if there are any objections to the proposed new component. I'm happy to maintain it long-term. It might be a good idea to have this in the spec first.

MrAlias commented 1 year ago

SGTM :+1:

dashpole commented 1 year ago

@anuraaga curious if you have feedback on the design above

aabmass commented 1 year ago

@dashpole should we introduce this to the spec? I don't think it should block this issue, but I would love to see a unified approach across all languages

anuraaga commented 1 year ago

Thanks @dashpole - for reference, here is my manifest config

        - name: OTEL_RESOURCE_K8S_NAMESPACE_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: OTEL_RESOURCE_K8S_NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        - name: OTEL_RESOURCE_K8S_POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: OTEL_RESOURCE_K8S_CONTAINER_NAME
          valueFrom:
            fieldRef:
              # With our kustomization config, deployment name is always
              # same as container name, a pod otherwise has no way to
              # reference it.
              fieldPath: metadata.labels['deployment']
        - name: OTEL_RESOURCE_K8S_DEPLOYMENT_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.labels['deployment']
        - name: OTEL_RESOURCE_ATTRIBUTES
          value: "k8s.namespace.name=$(OTEL_RESOURCE_K8S_NAMESPACE_NAME),k8s.node.name=$(OTEL_K8S_NODE_NAME),k8s.pod.name=$(OTEL_RESOURCE_K8S_POD_NAME),k8s.container.name=$(OTEL_RESOURCE_K8S_CONTAINER_NAME),k8s.deployment.name=$(OTEL_RESOURCE_K8S_DEPLOYMENT_NAME)"
        - name: OTEL_SERVICE_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.labels['app']

(btw, I couldn't find a way to get the container name with the downward API, though I notice this issue description mentions it; wonder if I missed something)

This is indeed pretty annoying :) IIUC, the proposal would remove that gnarly OTEL_RESOURCE_ATTRIBUTES line, which is an improvement. However, I don't know how much of an improvement it is. Skimming through the OTEP, I saw that one motivation is to have a copy-pastable snippet that fills these in. Unfortunately, the most important attribute, service.name, cannot be copy-pasted, so in the end the user still has to think. Similar for k8s.container.name: while not as important, from what I've seen as a GCP user it does seem to be used in the UI's tree navigation, so presumably it is better to fill than not.

Another issue: while the OTEP mentions merge and telemetry-schema problems with the OTEL_RESOURCE_ATTRIBUTES variable, in practice I don't know if that variable is avoidable anyway. For example, my snippet doesn't include it yet, but my next update will add the highly useful service.version, which means I'm stuck with the variable anyway - and at that point it doesn't hurt much to have the k8s fields in it as well.

So to sum up - if this detector existed, I would take advantage of the simplification by tweaking the variable names, which would be an improvement, but IMO not a big one. Given that this would need to be implemented in all the languages, the bang for the relatively large effort seems low. As a user, I would personally be much happier if the manpower, which I presume isn't unlimited, went into making sure all the instrumentation records metrics and into more complete support for the SDK environment variables - setting the variables is awkward, but it's doable.

evantorrie commented 1 year ago

I understand how they are different (application internal vs. external collector), but in practice there does appear to be significant overlap between this and the k8sattributes processor in the OpenTelemetry Collector.

Meaning: how common is it for applications to run in a Kubernetes cluster that does not already have OpenTelemetry Collector infrastructure deployed and collecting most of this metadata for all pods, regardless of language?

Is the use case more for applications that are sending directly to an external-to-cluster endpoint?

jaredjenkins commented 1 year ago

@evantorrie @anuraaga I can add some color here, since this all started with a comment from me on @dashpole's PR. If you're coming from a Prometheus background, you really want an instance label that is more useful than a UUID. Traditionally, the Prometheus instance ID is a hostname or a pod name, which makes debugging much easier than constantly joining against the target_info gauge that OTel-to-Prom currently supports. As an aside, having to pass in any concept of an instance ID at all is also a big change for those coming from Prometheus.

What sparked this for me is that I saw that the GCP detector provided a cluster name. Traditionally, this is not always easy to obtain, because K8s doesn't bake the cluster name into any of the API objects; it's not part of the data model. So I was hoping to stitch this data together like so:

detector := gcpdetector.NewDetector()
instanceID, err := kubernetesInstanceID(ctx, detector)
//...
res, err := resource.New(
    context.Background(),
    resource.WithDetectors(detector),
    // Keep the default detectors
    resource.WithTelemetrySDK(),
    // Add your own custom attributes to identify your application.
    resource.WithAttributes(
        semconv.ServiceNameKey.String("myservice"),
        semconv.ServiceInstanceIDKey.String(instanceID),
    ),
)

// To obtain a unique identifier for Kube pods.
func kubernetesInstanceID(ctx context.Context, d resource.Detector) (string, error) {
  res, err := d.Detect(ctx)
  if err != nil {
     return "", errors.WithMessage(err, "cannot detect GCP resources")
  }
  attrs := res.Set()
  cluster, ok := attrs.Value(semconv.K8SClusterNameKey)
  if !ok {
    return "", errors.New("missing cluster name")
  }
  ns, ok := attrs.Value(semconv.K8SNamespaceNameKey)
  if !ok {
    return "", errors.New("missing namespace name")
  }
  pod, ok := attrs.Value(semconv.K8SPodNameKey)
  if !ok {
    return "", errors.New("missing pod name")
  }
  // attribute.Value needs AsString() to yield the underlying string.
  return fmt.Sprintf("%s/%s/%s", cluster.AsString(), ns.AsString(), pod.AsString()), nil
}

To @evantorrie's comments:

Injecting the Pod details into that instance label via environment variables is significantly cheaper than having another binary integrate with the k8s API. There are going to be cases where API integration makes a lot of sense (perhaps you need specific labels off of the Pod that tend to be more dynamic, or you want to grab metadata from related objects), but for the basic use case it's nice to have a usable instance ID.

jaredjenkins commented 1 year ago

But to @anuraaga's point, given that this still relies on env variables, it might not really provide a lot of value, especially if you need every SDK to conform to this. I could easily pluck this data from the environment myself and make any instance ID that I want.
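
For reference, a minimal sketch of that manual approach, assuming the downward-API env var names from the manifest earlier in this thread (purely illustrative):

package main

import (
    "fmt"
    "os"
)

func main() {
    // Assumes the pod spec injects these via the downward API, as in the
    // manifest above; the variable names are that manifest's convention.
    ns := os.Getenv("OTEL_RESOURCE_K8S_NAMESPACE_NAME")
    pod := os.Getenv("OTEL_RESOURCE_K8S_POD_NAME")
    fmt.Println(ns + "/" + pod) // hand-rolled instance ID, e.g. "default/myservice-abc12"
}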

dashpole commented 1 year ago

So to sum up - if this detector existed, I would take advantage of the simplification by tweaking the variable names, which would be an improvement, but IMO not a big one.

Agreed. If you are already setting environment variables, it isn't much of an improvement. IMO, the main advantage is that even without setting any environment variables users would detect pod name and namespace name using fallbacks described above. That makes the "default" experience for less sophisticated users much better, as multiple pods within a deployment wouldn't have overlapping telemetry.

anuraaga commented 1 year ago

Thanks @jaredjenkins - for me, entering this conversation was exactly that: wondering why my metrics don't have a pod name, and then finding this

https://github.com/open-telemetry/opentelemetry-go-contrib/blob/main/detectors/gcp/gke.go#L31

But as you confirmed, since this requires manifest changes anyway, and there will be non-k8s resource attributes I need to set too, the impact feels low to me. It definitely needs to be in the docs, but not necessarily in a detector, for me at least. If everything could happen transparently/automatically with no manifest change, that would of course be a dream come true. So, while it would take a while for a new version to become broadly usable, it would be great if a future k8s version provided such a metadata API (hint, hint, in case anyone has friends in that org :) ).

anuraaga commented 1 year ago

even without setting any environment variables users would detect pod name and namespace name using fallbacks described above

Ah yeah, I assumed there was some fundamental problem with the fallbacks, hence the deprecation of the gke detector I linked. If the non-manifest-based detection works fine, then I don't see an issue with it; I was mostly commenting on the processing of the env vars, since that seemed to be the main focus of the proposal.

dashpole commented 1 year ago

Ah yeah, I assumed there is some fundamental problem with the fallbacks hence the deprecation of the gke detector I linked.

The HOSTNAME env var defaults to the pod name in k8s, but can be overridden with pod.spec.hostname: https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-hostname-and-subdomain-fields

The namespace detection should always work.

So they are pretty good, but not perfect.

dashpole commented 1 year ago

The HOSTNAME env var can also be changed with setHostnameAsFQDN in the pod spec.

anuraaga commented 1 year ago

Cool - in that case my suggestion would be to focus this proposal on that: a generic k8s detector that magically detects those two attributes. As long as it doesn't override OTEL_RESOURCE_ATTRIBUTES (which, with SDK autoconfiguration defaults, I don't think it would), it would provide default values that can be overridden in those special cases, and that does indeed seem like a significant UX improvement. While that variable has some issues, as the OTEP calls out, they don't seem to be k8s-specific, so adding only these extra k8s variables is probably not worth it and would make any spec change that much harder, since new variables often don't make it in. xref https://github.com/open-telemetry/opentelemetry-specification/pull/3573 https://github.com/open-telemetry/semantic-conventions/pull/261

dashpole commented 1 year ago

As long as it doesn't override OTEL_RESOURCE_ATTRIBUTES, which with sdk autoconfiguration defaults I don't think it would

Would resource.Merge fail, since the schema URL of OTEL_RESOURCE_ATTRIBUTES can differ from the schema_url used by the detector? Or will it always succeed?
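
For context, a small Go sketch of the scenario in question; the exact merge behavior is version-dependent, so treat this as illustrative rather than definitive:

package main

import (
    "fmt"

    "go.opentelemetry.io/otel/attribute"
    "go.opentelemetry.io/otel/sdk/resource"
)

func main() {
    // Two resources with different, non-empty schema URLs, e.g. one built
    // from OTEL_RESOURCE_ATTRIBUTES and one from a detector.
    a := resource.NewWithAttributes("https://opentelemetry.io/schemas/1.4.0",
        attribute.String("service.name", "myservice"))
    b := resource.NewWithAttributes("https://opentelemetry.io/schemas/1.21.0",
        attribute.String("k8s.pod.name", "my-pod"))

    // The Go SDK reports a schema URL conflict from Merge rather than
    // silently picking one of the two URLs.
    merged, err := resource.Merge(a, b)
    fmt.Println(merged, err)
}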

anuraaga commented 1 year ago

Unfortunately, I suspect this is going to differ among the SDKs, as resource merging with different schema URLs isn't really well defined by the spec AFAIK. For example, Java will remove the schema URL:

https://github.com/open-telemetry/opentelemetry-java/blob/666bef3019c322c07f3a7e9d7388e9351358ff2e/sdk/common/src/main/java/io/opentelemetry/sdk/resources/Resource.java#L167

That being said, IMO OTEL_RESOURCE_ATTRIBUTES should generally not cause issues, because it's user-provided - so in principle it should not have a schema attached to it at all, or otherwise there should be an OTEL_RESOURCE_ATTRIBUTES_SCHEMA_URL that lets the user declare which schema they are working with. I filed this on Java to ask about that:

https://github.com/open-telemetry/opentelemetry-java/issues/5729